1. Executive Summary
Your server experienced connection timeouts because the number of active database connections (1,745) exceeded the Linux operating system's default Open File Limit (ulimit) of 1,024. When this limit is reached, the OS prevents the SQL Server service from opening new network sockets, causing all subsequent connection attempts to hang or time out.
2. Diagnostics (How to verify)
To confirm if you are hitting these limits in the future, use the following commands:
-
Check Active Connections (OS Level):
netstat -an | grep 1433 | wc -l(A high count, e.g., >1000, indicates a leak). -
Check Connection States:
netstat -an | grep 1433 | awk '{print $6}' | sort | uniq -c(High "ESTABLISHED" counts suggest application-level leaks). -
Check OS Limits:
ulimit -Sn(The "Soft Limit" is the ceiling that causes the timeout).
3. Immediate Resolution (Increasing Limits)
To allow MSSQL Enterprise to utilize its full capacity, you must increase the file descriptor limit for the service.
-
Modify the Service Configuration:
Bash
sudo systemctl edit mssql-server -
Add the following configuration:
Ini, TOML
[Service] LimitNOFILE=64000 -
Apply Changes:
Bash
sudo systemctl daemon-reload sudo systemctl restart mssql-serverThis increases the ceiling from 1,024 to 64,000, preventing immediate timeouts.
4. Root Cause: Addressing Connection Leaks
Increasing the limit is a temporary fix. 1,745 connections is abnormal for most workloads and indicates that the application is not closing connections properly.
Recommended Actions:
-
Identify the Source IP: Use
ss -atp | grep :1433to find which remote server is opening excessive connections. -
Code Review: Ensure all database connections in your application code use "Dispose," "Close," or "Using" blocks.
-
SQL Cleanup: Run the following query to identify "Sleeping" sessions that have been idle for a long time:
SQL
SELECT session_id, login_name, host_name, status FROM sys.dm_exec_sessions WHERE status = 'sleeping' AND is_user_process = 1;If sessions are 'sleeping' but not 'running', the application has forgotten to close them.
5. Enterprise Best Practices
-
Monitor Threadpool: In MSSQL Enterprise, monitor
sys.dm_os_schedulersto ensureactive_workers_countisn't reaching its limit, which can happen if connections continue to climb. -
Connection Pooling: Adjust your application's connection string to set a
Max Pool Size(default is 100) to prevent a single application instance from overwhelming the server.