CommandTimeout – How to handle it properly?
ADO.Net Data Provider for SQL Server (System.Data.SqlClient namespace in System.Data.dll assembly) has two common classes we often use: SqlConnection and SqlCommand. SqlCommand class has CommandTimeout property (do not mix it with the ConnectionTimeout which stops long login process). But, do we handle it properly? And what exactly is happening when the command timeout occurs?
If we ommit setting CommandTimeout property, the default of 30 seconds will be used. 0 means wait indefinitely, no timeout will be raised. When timeout occurs, execution of the TSQL stops immediately. It is untrappable which means execution is NOT transferred to the “BEGIN CATCH” TSQL block. Your COMMIT command is also NOT executed if timeout occurred before it is reached (and often is). Timeout can be seen as a sql trace event of a class “Attention”:
Bob Dorr, Microsoft’s Senior Escalation Engineer has blogged about Attention event, so let me quote him:
At the time of the attention the transaction is not rolled back unless transaction abort (XACT_ABORT) has been enabled. The client needs to submit a rollback. If the client does not submit the rollback and continues other processing the transaction remains open and the behavior scope of the application becomes undefined.
CommandTimeout will only STOP TSQL execution, but the client application is responsible to COMMIT or ROLLBACK the pending transaction (if not already rolled back because of SET XACT_ABORT ON). If client app does not explicitly commit or rollback, the transaction remains open, holding all the locks, blocking other sessions until somebody kills that session! Sounds familiar? Do you have occasional strange blocking with the root blocker that is sleeping session?
The session will stay forever in “sleeping” status. You can do a “proper” try-catch Dispose of command, nicely Close/Dispose the connection or do it with a “using” construct. All that “proper closing” of the connection will not abort the transaction and not even end the sleeping session (because of connection pooling, until that same connection is reused). Transaction is still there, leaked, holding all the locks on a forever “sleeping” session! (sleeping beauty, only this one is rather “ugly”).
That “leaking transaction” behavior is the same on all SqlClient versions I tried (dotnet 2.0, 4.0, 4.5.2) and all SQL Server versions I tried (SQL 2008, 2008R2, 2014). Moreover, it is “by design”. It is “by desgin” to let developers decide what will they do with pending transactions. And we all handle those pending transactions with explicit commit/rollback in our dotnet app, don’t we? We have three choices to do with pending transaction when timeout exception is raised in dotnet code:
- commit it
- roll it back
- leave it opened forever holding all the locks (that is the default “choice” of most of us!)
Would it be nice that Microsoft had a property e.g. CommandTimeoutBehavior on a SqlCommand with those three choices? That would remove the burden from the developers and save lots and lots of hours of many DBAs.
Enough talk, let’s reproduce it and see how it looks like! If you run this script:
if object_id('test') is not null drop table test;
CREATE TABLE test(x int primary key);
CREATE procedure dbo.p_test as
truncate table test;
INSERT INTO test(x) values(1)
WAITFOR DELAY '00:00:8'
INSERT INTO test(x) values(2)
IF @@trancount>0 COMMIT TRAN
end try begin catch
IF @@trancount>0 ROLLBACK TRAN
…it will create a table and a procedure p_test in tempdb. Now, in SQL Server Management Studio (SSMS) set the command timeout to 5 seconds (Right click (context menu)->Connection->Change connection->Options):
And run the proc: exec dbo.p_test. After exactly 5 seconds, this error message will strike:
Msg -2, Level 11, State 0, Line 0
Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
Class : 11
Number : -2
State : 0
Message : Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
The procedure was designed to begin a transaction, insert the row1, wait 8 seconds, insert the row2 and commit. But because CommandTimeout was set to 5 seconds, during that 8 second wait the timeout occured. Let’s check sessions with sp_WhoIsActive and the locks with sys.dm_tran_locks in a SEPARATE query editor window (will be explained later why):
That means, transaction has begun, row1 is inserted, but row2 is NOT, and transaction is never committed – locks are here to stay. Also, the session 57 is in “sleeping” status “forever” with an opened transaction. That is, until we kill that session, close that query editor window, or execute some other command in exactly that same session (that is why we opened the other window).
When command timeout occurs (based on the CommandTimeout value), if XACT_ABORT is OFF, the explicit transaction that was running will leak. That means, it wont be rolled back, it wont be committed. It will hold all the locks indefinitely (ok, not indefinitely, but for a very long time, until it is killed or somebody reuses exactly that same session). Dotnet application developer needs to handle the exception raised by timeout, and send another command through the same connection object with explicit COMMIT/ROLLBACK to finish the pending transaction.
CommandTimeout = leaking transaction and forever sleeping session (unless dotnet explicitly commits/rollbacks)
And how exactly to do this explicit commit/rollback in dotnet, I wrote in this blog post.