How do I join the first row of a subquery?

Developer https://www.devze.com 2023-02-05 04:43 Source: Internet
I've got a table of invoices and a child table of related data, linked by key. In particular, for each invoice, I'm interested in only the first related row from the child table. Given that I want exactly one related row for every invoice key, how do I accomplish this?

Select i.[Invoice Number],
       c.[Carrier Name]
From Invoice i
    Left Join Carriers c on i.[InvoiceKey] = c.[InvoiceKey]
Where -- what?

I guess semantically speaking, what I'm looking for is something akin to the concept of Top 1 c.CarrierName Group by InvoiceKey (or whatever the equivalent of that would be, if it were possible in T-SQL).

I've thought about doing a left join on a subquery, but that doesn't seem very efficient. Does anyone have any T-SQL tricks to achieve this efficiently?

Edit: Sorry guys, I forgot to mention this is SQL Server 2000, so while I'm going to give upvotes for the current SQL Server 2005/2008 responses that will work, I can't accept them I'm afraid.


Provided that Carriers has a PRIMARY KEY called id:

SELECT  i.[Invoice Number],
        c.[Carrier Name]
FROM    Invoice i
JOIN    Carriers c
ON      c.id = 
        (
        SELECT  TOP 1 ID
        FROM    Carriers ci
        WHERE   ci.InvoiceKey = i.InvoiceKey
        ORDER BY
                id -- or whatever
        )


;with cteRowNumber as (
    select c.InvoiceKey, c.[Carrier Name], ROW_NUMBER() over (partition by c.InvoiceKey order by c.[Carrier Name]) as RowNum
        from Carriers c
)
select i.[Invoice Number],
       rn.[Carrier Name]
    from Invoice i
        left join cteRowNumber rn
            on i.InvoiceKey = rn.InvoiceKey
                and rn.RowNum = 1


This works for me:

select ir.[Invoice Number], c.[Carrier Name]
from 
    (select ROW_NUMBER() over (order by i.[Invoice Number] asc) AS RowNumber, i.[Invoice Number], i.InvoiceKey
    from Invoice i) AS ir
left join Carriers c
on ir.InvoiceKey = c.InvoiceKey
where RowNumber = 1
union all
select ir.[Invoice Number], NULL as [Carrier Name]
from 
    (select ROW_NUMBER() over (order by i.[Invoice Number] asc) AS RowNumber, i.[Invoice Number]
    from Invoice i) AS ir
where RowNumber > 1

or

select TOP 1 i.[Invoice Number], c.[Carrier Name]
from Invoice i
left join Carriers c
on i.InvoiceKey = c.InvoiceKey
union all
select ir.[Invoice Number], NULL as [Carrier Name]
from 
    (select ROW_NUMBER() over (order by i.[Invoice Number] asc) AS RowNumber, i.[Invoice Number]
    from Invoice i) AS ir
where RowNumber > 1


This is how I would do it, using a slightly different syntax than yours (MySQL style), but I guess you could apply it to your solution as well:

SELECT i.invoiceNumber, c.carrierName
FROM Invoice as i
LEFT JOIN Carriers as c ON (c.id = (SELECT id FROM Carriers WHERE invoiceKey = i.invoiceKey ORDER BY id LIMIT 1))

This takes every record from Invoice and joins each one with one (or zero) records from Carriers: the first record, ordered by id, that has the same invoiceKey.

As long as you have an index on Carriers.invoiceKey the performance of this query should be acceptable.
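
For illustration, the index mentioned above could be created roughly like this. This is only a sketch: the index name is made up, and whether adding id as a second column helps depends on the engine and on how the table's primary key is stored.

```sql
-- Hypothetical index name; table and column names follow the query above.
-- Leading on invoiceKey serves the correlated lookup per invoice; adding id
-- may let ORDER BY id LIMIT 1 be resolved from the index alone.
CREATE INDEX ix_carriers_invoicekey_id ON Carriers (invoiceKey, id);
```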

Sebastian


Alternatively you could use OUTER APPLY as well (available from SQL Server 2005 onwards). Note that angle brackets mark placeholders for unknown field names:

Select i.[Invoice Number], x.[Carrier Name], x.<Carrier_field1>
From Invoice i
OUTER APPLY 
(
    -- return at most one Carriers row per invoice; outer columns are read through alias x
    SELECT TOP 1 c.*
    FROM Carriers c 
    WHERE c.[InvoiceKey] = i.[InvoiceKey]
    ORDER BY <order_clause>
) x


In such cases I often employ a device which I here apply to your example and describe below:

SELECT
  i.[Invoice Number],
  c.[Carrier Name]
FROM Invoice i
  INNER JOIN Carriers c ON i.InvoiceKey = c.InvoiceKey
  INNER JOIN (
    SELECT MIN(ID) AS ID
    FROM Carriers
    GROUP BY InvoiceKey
  ) c_top ON c.ID = c_top.ID

I think this is roughly what Quassnoi has posted, only I try to avoid using SELECT TOP like that.

Invoice is joined with Carriers based on their linking expression (InvoiceKey in this case). Now, Carriers can have multiple rows for the same InvoiceKey, so we need to limit the output. That is done using a derived table.

The derived table groups rows from Carriers by the same expression that is used for linking the two tables (InvoiceKey) and keeps only the smallest ID in each group, so joining on that ID leaves a single Carriers row per invoice.

And there's another way: instead of joining the derived table you could use IN (subquery) with the same effect. That is, the complete query would then look like this:

SELECT
  i.[Invoice Number],
  c.[Carrier Name]
FROM Invoice i
  INNER JOIN Carriers c ON i.InvoiceKey = c.InvoiceKey
    AND c.ID IN (SELECT MIN(ID) FROM Carriers GROUP BY InvoiceKey)
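
Both queries above use INNER JOIN, so an invoice with no rows in Carriers drops out of the result. If such invoices should still appear, as with the LEFT JOIN in the question, a variant along these lines might work. It is a sketch only, assuming ID is a unique key on Carriers; it uses no SQL Server 2005 features.

```sql
SELECT
  i.[Invoice Number],
  c.[Carrier Name]          -- NULL when the invoice has no carrier rows
FROM Invoice i
  LEFT JOIN (
    -- one row per InvoiceKey, carrying the smallest Carriers ID
    SELECT InvoiceKey, MIN(ID) AS ID
    FROM Carriers
    GROUP BY InvoiceKey
  ) c_top ON c_top.InvoiceKey = i.InvoiceKey
  LEFT JOIN Carriers c ON c.ID = c_top.ID
```

Exposing InvoiceKey from the derived table lets it be joined directly to Invoice, so the second join only has to fetch the one matching Carriers row.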


group by carriername having max(invoicenumber)

to get the first carrier for each invoice:

group by invoicenumber having max(carriername)
-- substitute the column you want to order by for carrier name to change which is 'first'
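
As written, `having max(carriername)` is not a complete predicate in T-SQL, so the fragments above will not run on their own. One possible reading of the idea, assuming that "first" means the lowest carrier name in sort order, is to put the aggregate in the select list instead:

```sql
SELECT i.[Invoice Number],
       MIN(c.[Carrier Name]) AS [Carrier Name]  -- use MAX for the last one instead
FROM Invoice i
    LEFT JOIN Carriers c ON i.InvoiceKey = c.InvoiceKey
GROUP BY i.[Invoice Number]
```

This only works when the column being returned is also the column that defines the ordering; to return other columns from the same Carriers row, one of the key-based approaches is needed.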