Edit: My question is, "Why does my first code example work?" Please read on...
Edit1: There is no doubt that a unique constraint is the correct 开发者_如何转开发way to ensure duplicates don't happen. This is a given. However, sometime we need to know that we're attempting a duplicate entry. Further, this post goes beyond merely handling duplicates.
This is potentially a repeat of probably over 100 questions on SO. I've read an endless array of confusions and contradictions about the simple questions of atomic updates, locks, and concurrency.
I see that blogs and experts disagree widely on these points. Here I provide test code based on the various solutions people have advised, indicate the results, state my views, and invite your comment.
Context: I'm running SQL Server 2008 Express SP2.
I created the following test table:
create table dbo.Temp (Col int)
The table deliberately has no constraints on it as we want to test the SQL code ideas, not constraints.
I ran the following concurrently in 2 and then 3 query windows:
declare @i int
set @i = 0
while @i < 5000 begin
set @i = @i + 1
update dbo.temp set Col = (SELECT Col from dbo.Temp) + 1
end
I did not use any explicit locking, as can be seen. All DB settings are default. I checked the value of Col and it was the desired number: 25,000. Nothing missed.
Since SQL Server is ACID, the "A" tells us that a single statement is executed atomically. Therefore, based on the above, we could agree with those who say that locks are not needed for a simple update as above.
Next, I ran the following concurrently in 3 query windows:
while @i < 5000 begin
set @i = @i + 1
insert into dbo.temp select @i where not exists
(select 1 from dbo.temp where Col = @i)
end
The results were not correct, despite the fact this is one statement. There were missing values, duplicate values, and > 5k rows.
Next, I ran the following popular solution concurrently in 3 query windows:
declare @i int
set @i = 0
while @i < 5000 begin
set @i = @i + 1
insert into dbo.temp select @i where not exists
(select 1 from dbo.temp with (updlock) where Col = @i)
end
The results were not correct. There were missing values, duplicate values, and > 5k rows.
Next, for those who have doubts that SQL Server implicitly wraps single statements in a transaction (aka "A" in ACID):
declare @i int
set @i = 0
while @i < 5000 begin
set @i = @i + 1
begin tran
insert into dbo.temp select @i where not exists
(select 1 from dbo.temp with (updlock) where Col = @i)
commit tran
end
Same incorrect results.
Next, I ran the following concurrently in 3 query windows:
declare @i int
set @i = 0
while @i < 5000 begin
set @i = @i + 1
insert into dbo.temp select @i where not exists
(select 1 from dbo.temp with (XLOCK, ROWLOCK) where Col = @i)
end
This worked. 5k unique values only.
Next, the following in 3 windows:
declare @i int
set @i = 0
while @i < 5000 begin
set @i = @i + 1
merge dbo.temp as t
using (select @i) as test (Col)
ON (t.Col = test.Col)
when not matched then
insert values (@i);
end
This worked. 5k unique values only.
My conclusions are:
- SQL Server is atomic for single SQL operations, not groups of operations that appear as one statement.
- UPDLOCK is not effective, as shown and as suggested by many, at providing the required locking needed to guarantee update integrity in a contentious environment.
- From a locking perspective, only XLOCK guarantees concurrency integrity and atomicity in multi-statement operations as shown.
- The MERGE command is a single command, and therefore is atomic.
Please test this for yourself.
Now, can someone explain to me why my first example worked? :-)
精彩评论