The following tables a开发者_如何学运维re involved:
Table Product:
product_id
merged_product_id
product_name
Table Company_Product:
product_id
company_id
(Company_Product has a primary key on both the product_id and company_id columns)
I now want to run an update on Company_Product to set the product_id column to a merged_ product_id. This update could cause duplicates which would trigger a primary key violation, so therefore I added a 'not exists' check in the where clause and my query looks like this:
update cp
set cp.product_id = p.merged_product_id
from Company_Product cp
join Product p on p.product_id = cp.product_id
where p.merged_product_id is not null
and not exists
(select * from Company_Product cp2
where cp2.company_id = cp.company_id and
cp2.product_id = p.merged_product_id)
But this query fails with a primary key violation.
What I think might happen is that because the Product table contains multiple rows with the same merged_product_id, it will succeed the for the first product, but when going to the next product with the same merged_product_id, it'll fail because the 'not exists' subquery does not see the first change, as the query has not finished and committed yet.
Am I right in thinking this, and how would I change the query to make it work?
[EDIT] Some data examples:
Product:
product_id merged_product_id
23 35
24 35
25 12
26 35
27 NULL
Company_Product:
product_id company_id
23 2
24 2
25 2
26 3
27 4
[EDIT 2] Eventually I went with this solution, which uses a temporary table to to the update on and then inserts the updated data into the original Company_Product table:
create table #Company_Product
(product_id int, company_id int)
insert #Company_Product select * from Company_Product
update cp
set cp.product_id = p.merged_product_id
from #Company_Product cp
join Product p on p.product_id = cp.product_id
where p.merged_product_id is not null
delete from Company_Product
insert Company_Product select distinct * from #Company_Product
drop table #Company_Product
A primary key is supposed to be three things:
- Non-null
- Unique
- Unchanging
By altering part of the primary key you're violating requirement #3.
I think you'd be better off creating a new table, populating it, then drop the constraints, drop the original table, and rename the new table to the desired name (then of course, re-apply the original constraints). In my experience this gives you the chance to check out the 'new' data before making it 'live'.
Share and enjoy.
You can use MERGE if you are on SQL 2008 at least.
Otherwise you're going to have to choose a criteria to establish which merged_product_id you want in and which one you leave out:
update cp
set cp.product_id = p.merged_product_id
from Company_Product cp
cross apply (
select top(1) merged_product_id
from Product
where product_id = cp.product_id
and p.merged_product_id is not null
and not exists (
select * from Company_Product cp2
where cp2.company_id = cp.company_id and
cp2.product_id = merged_product_id)
order by <insert diferentiating criteria here>) as p
Note that this is not safe if multiple concurrent requests are running the merge logic.
I can't quite see how your structure is meant to work or what this update is trying to achieve. You seem to be updating Company_Product
and setting a (new) product_id
on an existing row that apparently has a different product_id
; e.g., changing the row from one product to another. This seems...an odd use case, I'd expect you to be inserting a new unique row. So I think I'm missing something.
If you're converting Company_Product
to using a new set of product IDs instead of an old set (the name "merged_product_id" makes me speculate this), are you sure that there is no overlap between the old and new? That would cause a problem like what you're describing.
Without seeing your data, I believe your analysis is correct - the entire set is updated and then the commit fails since it results in a constraint violation. An EXISTS is never re-evaluated after "partial commit" of some of the UPDATE.
I think you need to more precisely define your rules regarding attempting to change multiple products to the same product according to the merged_product_id and then make those explicit in your query. For instance, you could exclude any products which would fall into that category with a further NOT EXISTS with appropriate query.
I think you are correct on why the update is failing. To fix this, run a delete query on your company_product table to remove the extra product_ids where the same merged_prduct_id will be applied.
here is a stab at what the query might be
delete company_product
where product_id not in (
select min(product_id)
from product
group by merged_product_id
)
and product_id not in (
select product_id
from product
where merged_product_id is null
)
-- Explanation added in resonse to comment --
What this tries to do is to delete rows that will be duplicates after the update. Since you have products with multiple merged ids, you really only need one of those products (for each company) in the table when you are done. So, my query (if it works...) will keep the min original product id for each merged product id - then your update will work.
So, let's say you have 3 product ids which will map to 2 merged ids: 1 -> 10, 2 -> 20, 3 -> 20. And you have the following company_product data:
product_id company_id
1 A
2 A
3 A
If you run your update against this, it will try to change both the second and third rows to product id 20, and it will fail. If you run the delete I suggest, it will remove the third row. After the delete and the update, the table will look like this:
product_id company_id
10 A
20 A
Try this:
create table #Company_Product
(product_id int, company_id int)
create table #Product (product_id int,merged_product_id int)
insert into #Company_Product
select 23, 2
union all select 24, 2
union all select 25, 2
union all select 26, 3
union all select 27, 4
insert into #product
Select 23, 35
union all select 24, 35
union all select 25, 12
union all select 26, 35
union all select 27, NULL
update cp
set product_id = merged_product_id
from #company_product cp
join
(
select min(product_id) as product_id, merged_product_id
from #product where merged_product_id is not null
group by merged_product_id
) a on a.product_id = cp.product_id
delete cp
--select *
from #company_product cp
join #product p on cp.product_id = p.product_id
where cp.product_id <> p.merged_product_id
and p.merged_product_id is not null
精彩评论