I have a table, derived from several different tables, with several similar (near duplicate) rows and need to resolve similar rows into single rows. Using DISTINCT or UNION resolves the rows but in an arbitrary manner. I want to phrase rules for preferring one row over the other. For example:
Table: Foo
FieldA, FieldB, FieldC, FieldD, FieldE
abc123, 123abc, 开发者_如何学运维short, short, extra
abc123, 123abc, Longer, short, extra
abc123, 123abc, short, Longer, extra
abc123, 123abc, Longer, Longer, extra
abc123, 123abc, short, short, extra
The relationships are:
- FieldA and FieldB form uniqueness (i.e. 1 row for each unique set of these two fields)
- The longer FieldC and FieldD are the better
- I need to perform a simple WHERE clause on FieldE Performing a GROUP BY can resolve to unique FiledA & FieldB combinations (relationship 1) but how do I resolve which of the rows within the grouping get picked.
The desired result from the table above would be the 4th row:
abc123, 123abc, Longer, Longer, extrayou will use a combination of GROUP BY
and HAVING count(*) > 1
clauses.
Ok here's the setup (using your elegantly named sample columns):
CREATE TABLE dbo.Foo
(
FieldA varchar(50) NOT NULL,
FieldB varchar(50) NOT NULL,
FieldC varchar(50) NOT NULL,
FieldD varchar(50) NOT NULL,
FieldE varchar(50) NOT NULL
)
go
insert into foo values('abc123', '123abc', '01', '01', '')
insert into foo values('abc123', '123abc', '012', '012', '')
insert into foo values('abc123', '123abc', '0123', '01', '')
insert into foo values('abc123', '123abc', '01234567', '01', '')
insert into foo values('abc123', '123abc', '012345', '012345', '')
insert into foo values('def123', '123def', '012345', '012345', '')
insert into foo values('def123', '123def', '', '012345', '')
And the solution:
select * from
(
select *, row_number() over(partition by fielda, fieldb order by len(fieldc) + len(fieldd) desc) seq
from foo
) ordered
where seq = 1
This will give you unique combinations of your first two columns, and the longest (defined as length of col3 + the length of col4)
Pull out the where clause to see the values it came up for all rows.
精彩评论