Insert if NOT exists – duplicate key problem

When multiple sessions insert into the same table, and something unique is there (uq index, uq constraint), a “duplicate key” error (2601, 2627) occurs from time to time, almost inevitable. If your app is affected with intermittent “duplicate key” surprises, please read on – we will scrutinize on how to avoid that error. If you are in a hurry – go for the summary at the bottom.

We want to avoid these errors, all of which are seen in these tests:

  • Error 2601, severity level 14:
  • Cannot insert duplicate key row in object ‘%.*ls’ with unique index ‘%.*ls’. The duplicate key value is %ls.
  • Error 2627, severity level 14:
  • Violation of %ls constraint ‘%.*ls’. Cannot insert duplicate key in object ‘%.*ls’. The duplicate key value is %ls.
  • Error 1205, severity level 13:
  • Transaction (Process ID %d) was deadlocked on %.*ls resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

The Setup

We want the test the blood out of SQL, we do not want nice examples here, we want errors to jump-out immediately if anything is wrong with the code. Therefore, we will have two sessions constantly colliding on unique index, because they are both inserting first available number – a perfect nasty “duplicate key” problem generator. Here is the code for our test table:

Approaches

There are several approaches one can take:

    1) IF NOT EXISTS, INSERT – two separate commands: check, then insert

We want check and insert to be as close as possible, almost atomic. Because if we crump many commands in between, we got ourselves a long transaction holding locks, and problems with blocking and deadlocks.

     2) INSERT INTO, WHERE NOT EXISTS

    3) MERGE INTO, WHEN NOT MATCHED BY TARGET THEN INSERT – we won’t do it since it should give the same result as INSERT-WHERE. Maybe in the future we can add that test if you show the interest for it.

    4) INSERT (plain) – allow “duplicate key” to happen, but catch it in a TRY-CATCH as nothing happened and retry until succeeded. We won’t do that because the error still happens, just hushed.

    5) IGNORE_DUP_KEY set on the unique index, to skip inserting duplicate rows with a warning – we won’t do it here because for a NON-CLUSTERED index it is a bad idea. Nasty range locks and bad execution plans occur.

    6) Increase isolation level to REPEATABLE READ / SERIALIZABLE. That creates lot of unnecessary locks and causes blocking issues – we want to avoid that, and use as minimal locks as possible.

Variations for test combinations

  • We will test mostly with RCSI ON which is recommended, but also with the sad default of RCSI OFF. Here is how to flip it:
  • HINTS can be added to lock the row upfront, earlier and more severe than without the hint. Various hint combinations are possible. Or we can be without a hint.
  • Two statement (if not exists-insert) or single statement (insert where not exists) approach.

There are lot of combinations and we will not test all of them, but only the most important ones. The ones that work, and the ones that are common and do not work.

The test

As mentioned, we will do a loop that tries to insert a value if it does not already exists. It should be run from two separate sessions in parallel. It produces an error very quickly since sessions constantly collide on each-other because they both want to insert the same value – a purposely nasty design for easier problem detection. We use a transaction to enable both commands in two-command approach to exist in the same transaction and locks survive from check till insert. Here is that loop:

Results

“Single cmd” means INSERT-WHERE NOT EXISTS, and “Two cmds” means IF NOT EXISTS-INSERT approach. These combinations fail:

Approach RCSI HINT Result
Single cmd OFF none “duplicate key” error
Single cmd OFF WITH(HOLDLOCK) deadlock (1205)
Single cmd ON none “duplicate key” error
Single cmd ON WITH(READCOMMITTEDLOCK) “duplicate key” error
Single cmd ON WITH(READCOMMITTEDLOCK, HOLDLOCK) “Conflicting locking hints” error
Two cmds ON none “duplicate key” error
Two cmds ON WITH(READCOMMITTEDLOCK) “duplicate key” error
Two cmds ON WITH(ROWLOCK, XLOCK) “duplicate key” error
Single cmd ON WITH(ROWLOCK, XLOCK) “duplicate key” error

And these are the ones that work, the “golden” ones:

Approach RCSI HINT Result
Two cmds ON WITH(ROWLOCK, XLOCK, HOLDLOCK) Works!
Single cmd ON WITH(ROWLOCK, XLOCK, HOLDLOCK) Works!
Single cmd ON WITH(XLOCK, HOLDLOCK) Works!
Single cmd ON WITH(UPDLOCK, HOLDLOCK) Works!
Two cmds ON WITH(UPDLOCK, HOLDLOCK) Works!

Summary

To avoid duplicate key error you will have to add some hints next to table when you check for existence. At least “WITH(UPDLOCK, HOLDLOCK)“. It works for both two-commands and single-command approach, even under RCSI! Eg, it looks like this:

I hope it is useful to you.

Leave a Reply

Your email address will not be published. Required fields are marked *

*

This site uses Akismet to reduce spam. Learn how your comment data is processed.