I have a table with many IDs and many dates associated with each ID, and even a few IDs with no date. For each ID and date combination, I want to开发者_如何转开发 select the ID, date, and the next largest date also associated with that same ID, or null as next date if none exists.
Sample Table:
ID Date
1 5/1/10
1 6/1/10
1 7/1/10
2 6/15/10
3 8/15/10
3 8/15/10
4 4/1/10
4 4/15/10
4
Desired Output:
ID Date Next_Date
1 5/1/10 6/1/10
1 6/1/10 7/1/10
1 7/1/10
2 6/15/10
3 8/15/10
3 8/15/10
4 4/1/10 4/15/10
4 4/15/10
SELECT
mytable.id,
mytable.date,
(
SELECT
MIN(mytablemin.date)
FROM mytable AS mytablemin
WHERE mytablemin.date > mytable.date
AND mytable.id = mytablemin.id
) AS NextDate
FROM mytable
This has been tested on SQL Server 2008 R2 (but it should work on other DBMSs) and produces the following output:
id date NextDate ----------- ----------------------- ----------------------- 1 2010-05-01 00:00:00.000 2010-06-01 00:00:00.000 1 2010-06-01 00:00:00.000 2010-06-15 00:00:00.000 1 2010-07-01 00:00:00.000 2010-08-15 00:00:00.000 2 2010-06-15 00:00:00.000 2010-07-01 00:00:00.000 3 2010-08-15 00:00:00.000 NULL 3 2010-08-15 00:00:00.000 NULL 4 2010-04-01 00:00:00.000 2010-04-15 00:00:00.000 4 2010-04-15 00:00:00.000 2010-05-01 00:00:00.000 4 NULL NULL
Update 1: For those that are interested, I've compared the performance of the two variants in SQL Server 2008 R2 (one uses MIN aggregate and the other uses TOP 1 with an ORDER BY):
Without an index on the date column, the MIN version had a cost of 0.0187916 and the TOP/ORDER BY version had a cost of 0.115073 so the MIN version was "better".
With an index on the date column, they performed identically.
Note that this was testing with just these 9 records so the results could be (very) spurious...
Update 2: The results hold for 10,000 uniformly distributed random records. The TOP/ORDER BY query takes so long to run at 100,000 records I had to cancel it and give up.
If your db is oracle, you can use lead() and lag()
functions.
SELECT id, date,
LEAD(date, 1, 0) OVER (PARTITION BY ID ORDER BY Date DESC NULLS LAST) NEXT_DATE,
FROM Your_table
ORDER BY ID;
SELECT
id,
date,
( SELECT date
FROM table t1
WHERE t1.date > t2.date
ORDER BY t1.date LIMIT 1 )
FROM table t2
I think self JOIN
would be faster than subselect.
WITH dates AS (
SELECT 1 AS ID, '2010-05-01' AS Date
UNION ALL SELECT 1, '2010-06-01'
UNION ALL SELECT 1, '2010-07-01'
UNION ALL SELECT 2, '2010-06-15'
UNION ALL SELECT 3, '2010-08-15'
UNION ALL SELECT 3, '2010-08-15'
UNION ALL SELECT 4, '2010-04-01'
UNION ALL SELECT 4, '2010-04-15'
UNION ALL SELECT 4, ''
)
SELECT
dates.ID,
dates.Date,
nextDates.Date AS Next_Date
FROM
dates
LEFT JOIN
dates nextDates
ON nextDates.ID = dates.ID
AND nextDates.Date > dates.Date
LEFT JOIN
dates noLower
ON noLower.ID = nextDates.ID
AND noLower.Date < nextDates.Date
AND noLower.Date > dates.Date
WHERE
dates.Date > 0
AND noLower.ID IS NULL
https://www.db-fiddle.com/f/4sWRLt2hxjik5HqiJ21ez8/1
精彩评论