Consolidating values and deleting duplicates in a junction table using CTE_问答_开发者

Consolidating values and deleting duplicates in a junction table using CTE

开发者 https://www.devze.com 2022-12-27 08:44 出处：网络

I have the following schema: ParcelsSegmentsSegmentsParcels ==================================== ParcelIDSegmentIDParcelID

I have the following schema:

Parcels       Segments        SegmentsParcels
=========     ==========      =================
ParcelID      SegmentID       ParcelID
...           Name            SegmentID
              ...             id

A user of the data wants to consolidate Segments.Names and gave me a list of current Segment.Names mapped to new Segment.Names (all of which currently exist).

So now I have this list in 开发者_如何学Pythona temporary table with the currentID and newID to map to.

What I want to do is update the SegmentID in SegmentsParcels based on this map. I could use the statement:

update SegmentParcels set segmentID = [newID] from newsegments where segmentID = currentid

but this will create some duplicates I have a unique constraint on ParcelID and SegmentID in SegmentParcels.

What is the best way to go about this? I considered removing the constraint and then dealing with removing the duplicates (which I did at one point and could probably do again) but I was hoping there was a simpler way.

try something like this:

CREATE TABLE #Results (oldID int, newid int)

update sp
    set segmentID = dt.newID
    OUTPUT sp.segmentID, dt.newID
        INTO #Results
    FROM SegmentParcels sp
    INNER JOIN (SELECT DISTINCT
                    segmentID, newID
                    FROM newsegments              ns 
                        INNER JOIN SegmentParcels sp ON ns.currentid = sp.segmentID
                        LEFT OUTER JOIN SegmentParcels sp2 ON ns.currentid = sp2.segmentID AND sp.ParcelID=sp2.ParcelID
                    WHERE sp2.ParcelID IS NULL

               ) dt ON dt.segmentID=sp.segmentID

I threw in the OUTPUT clause which is SQL Server 2005 and up, just because it might be helpful if you have a large script running. This way you have the old and new values.

I ended up dropping the constraint and deleting the duplicates using a common table expression.
Here is the sql I used:

-- drop unique constraint
ALTER TABLE [dbo].[SegmentParcels] DROP CONSTRAINT [uc_SegmentID_ParcelID]
GO

-- update segment ids to new values from map
update SegmentParcels
set segmentID = [newID] from newsegments where segmentID = currentid
GO

-- use common table expression to delete duplicates
WITH Duplicates(SegmentID, ParcelID, Id)
AS
(
    SELECT SegmentID, ParcelID, Min(Id) Id
    FROM SegmentParcels
    GROUP BY segmentID, parcelID
    HAVING Count(*) > 1
)
DELETE FROM SegmentParcels
WHERE Id IN (
    SELECT SegmentParcels.Id
    FROM SegmentParcels
    INNER JOIN Duplicates
    ON SegmentParcels.segmentID = Duplicates.SegmentID
    AND SegmentParcels.parcelID = Duplicates.ParcelID
    AND SegmentParcels.Id <> Duplicates.Id
) 

-- add unique constraint back
ALTER TABLE [dbo].[SegmentParcels] ADD  CONSTRAINT [uc_SegmentID_ParcelID] UNIQUE NONCLUSTERED 
(
    [segmentID] ASC,
    [parcelID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO