READ COMMITTED sucks!
Are you still using the READ COMMITTED transaction isolation level? The default transaction isolation level on all SQL Server versions (2000-2014) has serious inconsistency problems, by design. Not many people are familiar with how and why it happens. I am writing this because a fellow MCM just had that problem, scratching his head over how it was possible. A few days ago I met an experienced Oracle DBA, a friend (yes, I do have Oracle friends! :)), who asked me “Does SQL Server still have consistency problems?”, making me scratch my head over what he meant by that. I will show you demo code here to reproduce the problem, and a solution to it.
How does READ COMMITTED work?
Under the default READ COMMITTED isolation level, shared locks on rows/pages are released as soon as each row is read. The consequence is two anomalies at the single-statement level, which some call “inconsistent analysis”:
– reading the same row multiple times
– skipping legitimate rows, although they are present in the table the whole time
These are serious, statement-level inconsistencies. E.g., a single simple SELECT can read the same row twice and bring you duplicates in the result, or it can skip some rows entirely. The worst thing is that you will often not notice at all that the data is inconsistent.
If you are running a SELECT and another transaction updates a row you have already read in a way that moves that row ahead of your reading point, you will read that row twice. If the other transaction updates a row you have not yet read in a way that moves that row behind your reading point, you will not read that row at all. Such data anomalies are difficult to reproduce because the timing is crucial.
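If you are not sure which isolation level your session is actually running under, you can check it through the sys.dm_exec_sessions DMV (available since SQL 2005); a value of 2 means READ COMMITTED:

-- Check the current session's isolation level (2 = READ COMMITTED)
select transaction_isolation_level
from sys.dm_exec_sessions
where session_id = @@SPID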
Let’s reproduce it!
Here is the script. You have to be quick switching to the second session because timing is crucial:
--==== SETUP - begin
create database test_artefacts
go
alter database test_artefacts set recovery simple
alter database test_artefacts set read_committed_snapshot off
alter database test_artefacts set allow_snapshot_isolation off
go
use test_artefacts
go
create table big_table
(
    id int primary key clustered,
    filler char(8000) default('a')
)
go
-- Insert 500 000 rows (50 batches of 10 000 below), or a number large enough
-- for selecting all rows from that table to take at least 5 seconds.
truncate table big_table
go
insert into big_table(id, filler)
select t.id, t.id
from ( select top 10000 id = ROW_NUMBER() over(order by (select null)) + isnull((SELECT max(id) from big_table), 0)
       from sys.all_columns a cross join sys.all_columns b
     ) t
go 50
if OBJECT_ID('tempdb..#t') is not null drop table #t
--select COUNT(*) from big_table -- 500 000
--==== SETUP - end
-- Start session 1, then session 2 very soon after (about 1 second or less)
--==== SESSION 1 - begin
-- Copy 500 000 rows table
select * into #t from big_table
--==== SESSION 1 - end
--==== SESSION 2 - begin
--Update the first row to move it to the end of the clustered index.
--We are NOT changing the number of records. We do no inserts or deletes.
--The number of rows should stay constant.
update t set id=id+1000000
from (select top 1 * from dbo.big_table order by id) t
go 200 -- repeat while session 1 is still scanning; adjust the count to your timing
--==== SESSION 2 - end
select COUNT(*) from #t -- 500 123! We expected 500 000.
select COUNT(*) from big_table -- 500 000!
-- Run the same experiment again (drop #t in session 1 first), but this time with this code in session 2:
update t set id=-id
from (select top 1 * from dbo.big_table order by id desc) t
go 200 -- again, repeat while session 1 is still scanning
select COUNT(*) from #t -- 499 877! We expected 500 000.
select COUNT(*) from big_table -- 500 000!
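If you want to see exactly which rows were read twice in the first run, a simple aggregate over the #temp table reveals them (the id values themselves will vary from run to run):

-- Rows appearing more than once in the copy are the ones read twice
select id, COUNT(*) as times_read
from #t
group by id
having COUNT(*) > 1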
Basically, the script creates a database and a 500 000-row table. Then it copies the full table to a #temp table while another session does updates. We would expect the copy of the table to have the same number of rows as the original table (500 000), since the UPDATE command does not change the number of rows – at every moment there are exactly 500 000 rows in the table.
But – surprise!
In the first example we copied 123 rows more (500 123), showing multiple reads of the same rows.
In the second example we copied 123 rows fewer (499 877), showing that some rows were skipped during the copy.
The solution to this problem is really simple, and has been available since SQL 2005. It is called READ COMMITTED SNAPSHOT. You can enable it at the database level with a single ALTER DATABASE command. But it also has consequences: tempdb grows, and changed rows become wider by a 14-byte versioning pointer, possibly causing page splits if your fill factor is 100%. So you probably want to test it before enabling it on existing systems. New databases should be safe to use it from the start. On SQL 2000 your only option is to introduce more locking: the REPEATABLE READ or SERIALIZABLE isolation level.
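For completeness, here is how you would enable it on the demo database. Note that the ALTER needs exclusive access to the database; WITH ROLLBACK IMMEDIATE kicks out other sessions, so on a production system pick your moment carefully:

alter database test_artefacts set read_committed_snapshot on with rollback immediate
go
-- Verify the setting took effect
select name, is_read_committed_snapshot_on
from sys.databases
where name = 'test_artefacts'

With this on, readers see a consistent, versioned snapshot of each statement's data instead of releasing and reacquiring shared locks, and the demo above will return exactly 500 000 rows in #t every time.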