I want to do the SELECT / INSERT version of an UPSERT. Below is a template of the existing code:
// CREATE TABLE Table (RowID INT NOT NULL IDENTITY(1,1), RowValue VARCHAR(50))
IF NOT EXISTS (SELECT * FROM Table WHERE RowValue = @VALUE)
BEGIN
INSERT Table VALUES (@Value)
SELECT @id = SCOPEIDENTITY()
END
ELSE
SELECT @id = RowID FROM 开发者_开发技巧Table WHERE RowValue = @VALUE)
The query will be called from many concurrent sessions. My performance tests show that it will consistently throw primary key violations under a specific load.
Is there a high-concurrency method for this query that will allow it to maintain performance while still avoiding the insertion of data that already exists?
You can use LOCKs to make things SERIALIZABLE but this reduces concurrency. Why not try the common condition first ("mostly insert or mostly select") followed by safe handling of "remedial" action? That is, the "JFDI" pattern...
Mostly INSERTs expected (ball park 70-80%+):
Just try to insert. If it fails, the row has already been created. No need to worry about concurrency because the TRY/CATCH deals with duplicates for you.
BEGIN TRY
INSERT Table VALUES (@Value)
SELECT @id = SCOPE_IDENTITY()
END TRY
BEGIN CATCH
IF ERROR_NUMBER() <> 2627
RAISERROR etc
ELSE -- only error was a dupe insert so must already have a row to select
SELECT @id = RowID FROM Table WHERE RowValue = @VALUE
END CATCH
Mostly SELECTs:
Similar, but try to get data first. No data = INSERT needed. Again, if 2 concurrent calls try to INSERT because they both found the row missing the TRY/CATCH handles.
BEGIN TRY
SELECT @id = RowID FROM Table WHERE RowValue = @VALUE
IF @@ROWCOUNT = 0
BEGIN
INSERT Table VALUES (@Value)
SELECT @id = SCOPE_IDENTITY()
END
END TRY
BEGIN CATCH
IF ERROR_NUMBER() <> 2627
RAISERROR etc
ELSE
SELECT @id = RowID FROM Table WHERE RowValue = @VALUE
END CATCH
The 2nd one appear to repeat itself, but it's highly concurrent. Locks would achieve the same but at the expense of concurrency...
Edit:
Why not to use MERGE...
If you use the OUTPUT clause it will only return what is updated. So you need a dummy UPDATE to generate the INSERTED table for the OUTPUT clause. If you have to do dummy updates with many calls (as implied by OP) that is a lot of log writes just to be able to use MERGE.
// CREATE TABLE Table (RowID INT NOT NULL IDENTITY(1,1), RowValue VARCHAR(50))
-- be sure to have a non-clustered unique index on RowValue and RowID as your clustered index.
IF EXISTS (SELECT * FROM Table WHERE RowValue = @VALUE)
SELECT @id = RowID FROM Table WHERE RowValue = @VALUE
ELSE BEGIN
INSERT Table VALUES (@Value)
SELECT @id = SCOPEIDENTITY()
END
As always, gbn's answer is correct and ultimately lead me to where I needed to be. However, I found a particular edge case that wasn't covered by his approach. That being a 2601
error which identifies a Unique Index Violation
.
To compensate for this, I've modified his code as follow
...
declare @errornumber int = ERROR_NUMBER()
if @errornumber <> 2627 and @errornumber <> 2601
...
Hopefully this helps someone!
精彩评论