开发者

SQL query help with non-unique duplicates

开发者 https://www.devze.com 2023-01-28 00:38 出处:网络
I can\'t think through this one.I have this query: SELECT p.person_id, p.first_nm, p.last_nm, pu.purchase_dt,

I can't think through this one. I have this query:

SELECT 
    p.person_id,
    p.first_nm,
    p.last_nm, 
    pu.purchase_dt,
    pr.sku, 
    pr.description,
    a.address_type_id,
    a.city_cd, 
    a.state_cd, 
    a.postal_cd
FROM 
    person p 
    INNER JOIN address a ON p.person_id = a.person_id
    INNER JOIN purchase pu ON pu.person_id = p.person_id
    INNER JOIN product pr ON pr.product_id = pu.product_id

Simple enough - I just need to get the information for customers that we've shipped returns to. However, because of the addressType table

AddressType

address_type_id    address_type_desc
------------------------------------
1            Home
2            Shipping

some customers have multiple addresses in the address table, creating non-unique 开发者_如何学Pythonduplicate entries like this.

1,Smith, John, 12/01/2009, A12345, Purple Widget, 1, Anywhere, CA, 12345
1,Smith, John, 12/01/2009, A12345, Purple Widget, 2, Somewhere, ID, 54321

I'd like to get the query to return just one row/person and return the home address if available otherwise, return the shipping address.

This seems simple enough, and maybe it's just my cold, but this is causing me to scratch my head somewhat.


you want to change your join so it returns the min(addressID) instead of all of them:

        INNER JOIN address a ON p.person_id = a.person_id
        inner join (select person_id, min(address_type_id) as min_addr 
from address group by person_id) a_min 
on a.person_id = a_min.person_id and a.address_type_id = a_min.min_addr


SELECT 
    p.person_id,
    p.first_nm,
    p.last_nm, 
    pu.purchase_dt,
    pr.sku, 
    pr.description,
    COALESCE(ha.address_type_id, sa.address_type_id) AS address_type_id
    CASE WHEN ha.address_type_id IS NOT NULL THEN ha.city_cd ELSE sa.city_cd END AS city_cd, 
    CASE WHEN ha.address_type_id IS NOT NULL THEN ha.state_cd ELSE sa.state_cd END AS state_cd, 
    CASE WHEN ha.address_type_id IS NOT NULL THEN ha.postal_cd ELSE sa.postal_cd END AS postal_cd
FROM 
    person p 
    LEFT JOIN address ha ON p.person_id = ha.person_id AND ha.address_type_id = 1
    LEFT JOIN address sa ON p.person_id = sa.person_id AND sa.address_type_id = 2
    INNER JOIN purchase pu ON pu.person_id = p.person_id
    INNER JOIN product pr ON pr.product_id = pu.product_id


If SQL Server, or other version with common table expressions (CTE), you could do the following. The CTE adds a row-number column that is grouped by person and ordered by the address_type_id. The main query is altered to return number 1 row for each person from the CTE.

WITH cte AS
    (
    SELECT
         a.person_id,
         a.address_type_id,
         a.city_cd, 
         a.state_cd, 
         a.postal_cd,
         ROW_NUMBER() over (PARTITION BY person_id ORDER BY address_type_id) AS sequence
    FROM address a
    INNER JOIN AddressType at ON a.address_type_id = at.address_type_id
    )

    SELECT 
        p.person_id,
        p.first_nm,
        p.last_nm, 
        pu.purchase_dt,
        pr.sku, 
        pr.description,
        a.address_type_id,
        a.city_cd, 
        a.state_cd, 
        a.postal_cd
    FROM 
        person p 
        INNER JOIN cte a ON p.person_id = a.person_id
        INNER JOIN purchase pu ON pu.person_id = p.person_id
        INNER JOIN product pr ON pr.product_id = pu.product_id
    WHERE
        a.sequence = 1

By the way, if you have person records that have no addresses, you might want to change the INNER JOIN to an OUTER JOIN on the addresses table (cte in my answer). This may also be appropriate for joins to purchase and product if your requirements indicate so.

0

精彩评论

暂无评论...
验证码 换一张
取 消