I have a table with data about a customer, Customer(name, address), with rows like "John Doe", "Some Street 123". For each row in the table, I want to insert one row in the Person(id, name) table and also one row in the Address(id, person_id, address) table.
I can accomplish this by running two insert statements for each row in Customer:
insert into Person(name) values (@name);
insert into Address(person_id, address) values (scope_identity(), @address);
But this is inefficient. I want to do the inserts in a batch, kind of like this:
-- This works, the problem is with the Address table...
insert into Person(name)
select name from Customer
-- This looks good but does not work because name is not unique.
insert into Address(person_id, address)
select p.person_id, c.address
from Custome开发者_如何学运维r c join Person p on c.name = p.name
Leaving this here for the fellow Google traveler that finds this post like me.
I found this solution, and it seems to work great, and doesn't require any funky schema alterations: https://dba.stackexchange.com/questions/160210/splitting-data-into-two-tables-in-one-go
They use a MERGE
statement to perform the initial insert into the first table (the table that is generating the identity to be used everywhere else). The reason it uses the MERGE
statement is because it allows you to use an OUTPUT
statement, which you can use to output both the new identity value as well as the identity value from the source table (as opposed to using an OUTPUT
statement on a standard INSERT
which does not allow you to output the source tables identity). You can insert this output data into a mapping table, and use that mapping table to perform the second insert.
Here's my sample code for the solution:
------------------------------------------------------------------------------
------------------------------------------------------------------------------
-- Set up sample schema and data
------------------------------------------------------------------------------
--Source Data
IF OBJECT_ID('dbo.tmp1') IS NOT NULL DROP TABLE dbo.tmp1 --SELECT * FROM dbo.tmp1
CREATE TABLE dbo.tmp1 (tmp1ID INT IDENTITY(1,1), Col1 CHAR(1) NOT NULL, Col2 CHAR(1) NOT NULL, Col3 CHAR(1) NOT NULL, Col4 CHAR(1) NOT NULL, Col5 CHAR(1) NOT NULL, Col6 CHAR(1) NOT NULL)
INSERT INTO dbo.tmp1 (Col1, Col2, Col3, Col4, Col5, Col6)
SELECT x.c1, x.c2, x.c3, x.c4, x.c5, x.c6
FROM (VALUES ('A','B','C','D','E','F'),
('G','H','I','J','K','L'),
('M','N','O','P','Q','R')
) x(c1,c2,c3,c4,c5,c6)
IF OBJECT_ID('dbo.tmp3') IS NOT NULL DROP TABLE dbo.tmp3 --SELECT * FROM dbo.tmp3
IF OBJECT_ID('dbo.tmp2') IS NOT NULL DROP TABLE dbo.tmp2 --SELECT * FROM dbo.tmp2
--Taget tables to split into
CREATE TABLE dbo.tmp2 (
tmp2ID INT IDENTITY(1,1) NOT NULL CONSTRAINT PK_tmp2 PRIMARY KEY CLUSTERED (tmp2ID ASC)
, Col1 CHAR(1) NOT NULL
, Col2 CHAR(1) NOT NULL
, Col3 CHAR(1) NOT NULL
)
CREATE TABLE dbo.tmp3 (
tmp2ID INT NOT NULL
, Col4 CHAR(1) NOT NULL
, Col5 CHAR(1) NOT NULL
, Col6 CHAR(1) NOT NULL
, CONSTRAINT FK_tmp3_tmp2ID FOREIGN KEY(tmp2ID) REFERENCES dbo.tmp2 (tmp2ID)
)
------------------------------------------------------------------------------
------------------------------------------------------------------------------
-- Split data into two tables
------------------------------------------------------------------------------
DECLARE @Mapping TABLE (tmp1ID INT NOT NULL, tmp2ID INT NOT NULL);
--Use merge statment to output the source data PK as well as the newly inserted identity to generate a mapping table
MERGE INTO dbo.tmp2 AS tgt
USING dbo.tmp1 AS src ON (1=0)
WHEN NOT MATCHED THEN
INSERT ( Col1, Col2, Col3)
VALUES (src.Col1, src.Col2, src.Col3)
OUTPUT src.tmp1ID, Inserted.tmp2ID INTO @Mapping (tmp1ID, tmp2ID);
--Use the mapping table to insert the split data into the second table
INSERT INTO dbo.tmp3 (tmp2ID, Col4, Col5, Col6)
SELECT t2.tmp2ID, t1.Col4, t1.Col5, t1.Col6
FROM dbo.tmp2 t2
JOIN @Mapping m ON m.tmp2ID = t2.tmp2ID
JOIN dbo.tmp1 t1 ON t1.tmp1ID = m.tmp1ID
SELECT tmp2ID, Col1, Col2, Col3 FROM dbo.tmp2
SELECT tmp2ID, Col4, Col5, Col6 FROM dbo.tmp3
------------------------------------------------------------------------------
------------------------------------------------------------------------------
-- Clean up
------------------------------------------------------------------------------
DROP TABLE dbo.tmp1
DROP TABLE dbo.tmp3
DROP TABLE dbo.tmp2
------------------------------------------------------------------------------
------------------------------------------------------------------------------
GO
there is no way to do this as you explain because you lost scope_identity() value of each row of first insert.
A work around may be add Customer primary key fields to Person table and then make join of second insert with this fields:
before insert create customerID field on Person
alter table Person add customerID int null;
then bulk inserts:
-- inserting customerID
insert into Person(name, customerID)
select name, customerID from Customer
-- joining on customerID.
insert into Address(person_id, address)
select p.person_id, c.address
from Customer c
join Person p on c.customerID = p.customerID
after that you can remove customerID field from Person table:
alter table Person drop column customerID
It's better that you create some field of unique types in both table are related them.otherwise you want join as you dont have unique field for condition
精彩评论