Here's our current table:
CREATE TABLE Visitor
(
VisitorID bigint,
DayPhone varchar(50),
NightPhone varchar(50)
)
I want to migrate this data to a separate table:
CREATE TABLE VisitorPhone
(
VisitorID bigint,
Label varchar(50), --Day, Night, Work, Cell, etc.
Phone varchar(50)
)
My thought was that the most efficient way would be to do this:
INSERT INTO VisitorPhone(VisitorID, Label, Phone)
SELECT VisitorID, 'day', DayPhone FROM dbo.Visitor WHERE DayPhone IS NOT NULL AND DayPhone <> ''
INSERT INTO VisitorPhone(VisitorID, Label, Phone)
SELECT VisitorID, 'night', NightPhone FROM dbo.Visitor WHERE NightPhone IS NOT NULL AND NightPhone <> ''
What are my other options? We've talked about everything from SQL CLR functions to temp tables to ADO.NET, you name it. What's truly the most efficient way of doing this? Keep in mind that DayPhone and NightPhone are not part of an index, and that I have 16MM+ Visitor records, which will equate to somewhere between ~16MM and ~32MM VisitorPhone records.
I would have done the migration the way you already suggested. Your problem is that a Visitor row generates zero, one, or two rows in the VisitorPhone table. If it were Oracle, you would have the "INSERT ALL" syntax, which lets you do just this. Maybe some similar syntax is available in SQL Server?
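SQL Server has no INSERT ALL, but one way to get a similar "one source row, up to two target rows" effect in a single scan is CROSS APPLY with a VALUES list. This is only a sketch: it assumes SQL Server 2008 or later and that both tables live in the dbo schema, and I have not measured it against the two-statement version.

```sql
-- Single pass over dbo.Visitor: each row is expanded into two candidate
-- phone rows, then the empty and NULL ones are filtered out.
INSERT INTO dbo.VisitorPhone (VisitorID, Label, Phone)
SELECT v.VisitorID, p.Label, p.Phone
FROM dbo.Visitor AS v
CROSS APPLY (VALUES ('day',   v.DayPhone),
                    ('night', v.NightPhone)) AS p (Label, Phone)
WHERE p.Phone IS NOT NULL
  AND p.Phone <> '';
```

The trade-off is one table scan instead of two, at the cost of a single larger transaction, so it is worth comparing both on a copy of the data.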
Any procedural approach is likely to be outperformed by a set-based approach.
You can do something complicated like joining to a dummy table and determining how many times each Visitor row will be duplicated (0 = no phone, 1 = either a day or a night phone, 2 = both). You would then use CASE WHEN logic to determine how to encode each row.
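A rough sketch of that idea, simplified so that every Visitor row is joined to a two-row dummy set and the rows with no usable phone are filtered out afterwards, rather than counting duplicates up front. The derived table n and its column N are made up for illustration.

```sql
-- N = 1 produces the 'day' row, N = 2 produces the 'night' row.
INSERT INTO dbo.VisitorPhone (VisitorID, Label, Phone)
SELECT v.VisitorID,
       CASE n.N WHEN 1 THEN 'day' ELSE 'night' END,
       CASE n.N WHEN 1 THEN v.DayPhone ELSE v.NightPhone END
FROM dbo.Visitor AS v
CROSS JOIN (SELECT 1 AS N UNION ALL SELECT 2) AS n
-- A NULL phone makes the comparison UNKNOWN, so NULLs are dropped
-- together with empty strings.
WHERE CASE n.N WHEN 1 THEN v.DayPhone ELSE v.NightPhone END <> '';
```

It is noticeably harder to read than the two plain INSERT statements, which is part of why I would not bother unless testing shows a real gain.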
30 million rows is not a huge amount of data on anything bigger than your typical development laptop. I think that finding and testing an alternative approach would take longer than just executing the two statements. Plus, your current solution is easily documented.
Just be sure to create the indexes afterwards.
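For example, something along these lines once both inserts have finished. The index name and the choice of columns are assumptions on my part; pick whatever matches how you will query VisitorPhone.

```sql
-- Built after the load so the inserts do not pay for index maintenance.
CREATE NONCLUSTERED INDEX IX_VisitorPhone_VisitorID
    ON dbo.VisitorPhone (VisitorID)
    INCLUDE (Label, Phone);
```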