We currently have an issue that occurs roughly once a day on SQL 2005 database server, although the time it happens is not consistent.
Basically, the database grinds to a halt, and starts refusing connections with the following error message. This includes logging into SSMS:
A connection was successfully established with the server, but then an error occurred during the login process. (provider: TCP Provider, error: 0 - The specified network name is no longer available.)
Our CPU usage for SQL is usually 开发者_开发技巧around 15%, but when the DB is in it's broken state it's around 70%, so it's clearly doing something, even if no-one can connect. Even if I disable the web app that uses the database the CPU still doesn't go down.
I am unable to restart the SQLSERVER process as it is unresponsive, so I have to end up killing the process manually, which then puts the DB into Suspect/Recovery mode (which I can fix but it's a pain).
Below are some PerfMon stats I gathered when the DB was in it's broken state which might help. I have a bunch more if people want to request them:
- Active Transactions: 2 (Never
- Changes) Logical Connections: 34 (NC)
- Process Blocked: 16 (NC) User
- Connections: 30 (NC) Batch Request: 0
- (NC) Active Jobs: 2 (NC) Log
- Truncations: 596 (NC) Log Shrinks: 24
- (NC) Longest Running Transaction
- Time: 99 (NC)
I guess they key is finding out what the DB is using it's CPU on, but as I can't even log into SSMS this isn't possible with the standard methods.
Disturbingly, I can't even use the dedicated admin connection to get into SSMS. I get the same timout as with all other requests.
Any advice, reccomendations, or even sympathy, is much appreciated!
You will need to use the Profiler to determine what queries and process(es) may be causing this.
Whilst it is blocking normal connections, you might want to try going in under the Dedicated Admin Console connection. You will need to be in the sysadmin role of the database server to achieve this, in SSMS when you specify the server name prefix it with 'admin:' - this uses a different connection which is less likely to be blocked (but not impossible, just takes extreme circumstances).
You shouldn't use this DAC by default, you get access to the system tables and various other items you can not see normally, so you can also do a lot of damage with it.
Once in, you have a normal query window and can start looking at what is running, what's locked etc.
The dedicated admin connection is meant to help in these situations
Then, this script can tell you what has the open tran and the SQL running
SELECT s_tst.[session_id],
s_es.[login_name] AS [Login Name],
S_tdt.[database_transaction_begin_time] AS [Begin Time],
s_tdt.[database_transaction_log_record_count] AS [Log Records],
s_tdt.[database_transaction_log_bytes_used] AS [Log Bytes],
s_tdt.[database_transaction_log_bytes_reserved] AS [Log Reserved],
s_est.[text] AS [Last T-SQL Text],
s_eqp.[query_plan] AS [Last Query Plan]
FROM sys.dm_tran_database_transactions s_tdt
JOIN sys.dm_tran_session_transactions s_tst
ON s_tst.[transaction_id] = s_tdt.[transaction_id]
JOIN sys.[dm_exec_sessions] s_es
ON s_es.[session_id] = s_tst.[session_id]
JOIN sys.dm_exec_connections s_ec
ON s_ec.[session_id] = s_tst.[session_id]
LEFT OUTER JOIN sys.dm_exec_requests s_er
ON s_er.[session_id] = s_tst.[session_id]
CROSS APPLY sys.dm_exec_sql_text (s_ec.[most_recent_sql_handle]) AS s_est
OUTER APPLY sys.dm_exec_query_plan (s_er.[plan_handle]) AS s_eqp
ORDER BY [Begin Time] ASC;
Finally, SQL Server 2005 has a default trace running: you may be able to use this to find out what went wrong
精彩评论