开发者

Update query with 'not exists' check causes primary key violation

开发者 https://www.devze.com 2022-12-16 19:59 出处:网络
The following tables a开发者_如何学运维re involved: Table Product: product_id merged_product_id product_name

The following tables a开发者_如何学运维re involved:

Table Product:
product_id
merged_product_id
product_name

Table Company_Product:
product_id
company_id

(Company_Product has a primary key on both the product_id and company_id columns)

I now want to run an update on Company_Product to set the product_id column to a merged_ product_id. This update could cause duplicates which would trigger a primary key violation, so therefore I added a 'not exists' check in the where clause and my query looks like this:

update cp
set cp.product_id = p.merged_product_id
from Company_Product cp
join Product p on p.product_id = cp.product_id
where p.merged_product_id is not null
and not exists 
 (select * from Company_Product cp2 
  where cp2.company_id = cp.company_id and 
  cp2.product_id = p.merged_product_id)

But this query fails with a primary key violation.

What I think might happen is that because the Product table contains multiple rows with the same merged_product_id, it will succeed the for the first product, but when going to the next product with the same merged_product_id, it'll fail because the 'not exists' subquery does not see the first change, as the query has not finished and committed yet.

Am I right in thinking this, and how would I change the query to make it work?

[EDIT] Some data examples:

Product:

product_id merged_product_id    
   23            35    
   24            35    
   25            12    
   26            35    
   27           NULL

Company_Product:

product_id company_id    
   23          2    
   24          2    
   25          2    
   26          3    
   27          4

[EDIT 2] Eventually I went with this solution, which uses a temporary table to to the update on and then inserts the updated data into the original Company_Product table:

create table #Company_Product
(product_id int, company_id int)

insert #Company_Product select * from Company_Product

update cp
set cp.product_id = p.merged_product_id
from #Company_Product cp
join Product p on p.product_id = cp.product_id
where p.merged_product_id is not null

delete from Company_Product

insert Company_Product select distinct * from #Company_Product

drop table #Company_Product


A primary key is supposed to be three things:

  1. Non-null
  2. Unique
  3. Unchanging

By altering part of the primary key you're violating requirement #3.

I think you'd be better off creating a new table, populating it, then drop the constraints, drop the original table, and rename the new table to the desired name (then of course, re-apply the original constraints). In my experience this gives you the chance to check out the 'new' data before making it 'live'.

Share and enjoy.


You can use MERGE if you are on SQL 2008 at least.

Otherwise you're going to have to choose a criteria to establish which merged_product_id you want in and which one you leave out:

update cp
set cp.product_id = p.merged_product_id
from Company_Product cp
cross apply (
  select top(1) merged_product_id
  from Product 
  where product_id = cp.product_id
  and p.merged_product_id is not null
  and not exists (
    select * from Company_Product cp2 
    where cp2.company_id = cp.company_id and 
    cp2.product_id = merged_product_id)
  order by <insert diferentiating criteria here>) as p

Note that this is not safe if multiple concurrent requests are running the merge logic.


I can't quite see how your structure is meant to work or what this update is trying to achieve. You seem to be updating Company_Product and setting a (new) product_id on an existing row that apparently has a different product_id; e.g., changing the row from one product to another. This seems...an odd use case, I'd expect you to be inserting a new unique row. So I think I'm missing something.

If you're converting Company_Product to using a new set of product IDs instead of an old set (the name "merged_product_id" makes me speculate this), are you sure that there is no overlap between the old and new? That would cause a problem like what you're describing.


Without seeing your data, I believe your analysis is correct - the entire set is updated and then the commit fails since it results in a constraint violation. An EXISTS is never re-evaluated after "partial commit" of some of the UPDATE.

I think you need to more precisely define your rules regarding attempting to change multiple products to the same product according to the merged_product_id and then make those explicit in your query. For instance, you could exclude any products which would fall into that category with a further NOT EXISTS with appropriate query.


I think you are correct on why the update is failing. To fix this, run a delete query on your company_product table to remove the extra product_ids where the same merged_prduct_id will be applied.

here is a stab at what the query might be

delete company_product
  where product_id not in (
    select min(product_id)
      from product
      group by merged_product_id
  )
  and product_id not in (
    select product_id
      from product
      where merged_product_id is null
  )

-- Explanation added in resonse to comment --

What this tries to do is to delete rows that will be duplicates after the update. Since you have products with multiple merged ids, you really only need one of those products (for each company) in the table when you are done. So, my query (if it works...) will keep the min original product id for each merged product id - then your update will work.

So, let's say you have 3 product ids which will map to 2 merged ids: 1 -> 10, 2 -> 20, 3 -> 20. And you have the following company_product data:

product_id  company_id
1           A
2           A
3           A

If you run your update against this, it will try to change both the second and third rows to product id 20, and it will fail. If you run the delete I suggest, it will remove the third row. After the delete and the update, the table will look like this:

product_id  company_id
10          A
20          A


Try this:

create table #Company_Product
(product_id int, company_id int)
create table #Product (product_id int,merged_product_id int)
insert into #Company_Product
select           23, 2     
union all select 24, 2     
union all select 25, 2     
union all select 26, 3     
union all select 27, 4
insert into #product 
Select              23, 35     
union all select    24, 35     
union all select    25, 12     
union all select    26, 35     
union all select   27, NULL 

update cp 
set product_id = merged_product_id
from #company_product cp
join
  ( 
    select min(product_id) as product_id, merged_product_id  
      from #product where merged_product_id is not null
      group by merged_product_id 
  ) a on a.product_id = cp.product_id

delete cp 
--select *
from #company_product cp
join #product p on cp.product_id = p.product_id
where cp.product_id <> p.merged_product_id
and p.merged_product_id is not null
0

精彩评论

暂无评论...
验证码 换一张
取 消