I'm trying to get this query to work.
The query was written by "OMG Ponies" as an answer to: fix mysql query to return random row within subgroup
The query below correctly calculates the difference in dates, but it fails to select the row (within each ID1-ID2 pair) with the minimum value of that difference.
DROP TABLE IF EXISTS temp4;
CREATE TABLE temp4 AS
SELECT x.id1,
       x.id2,
       x.YEAR,
       x.MMDD,
       x.id3,
       x.id3_YEAR,
       x.id3_MMDD
  FROM (SELECT t.*,
               ABS(DATEDIFF(CONCAT(CAST(t.id3_YEAR AS CHAR(4)),'-', LEFT(t.id3_MMDD,2),'-',RIGHT(t.id3_MMDD,2)),
                            CONCAT(CAST(t.YEAR AS CHAR(4)),'-', LEFT(t.MMDD,2),'-',RIGHT(t.MMDD,2)))) AS diff,
               CASE
                 WHEN @id1 = t.id1 AND @id2 = t.id2 THEN @rownum := @rownum + 1
                 ELSE @rownum := 1
               END AS rk,
               @id1 := t.id1,
               @id2 := t.id2
          FROM temp3 t
          JOIN (SELECT @rownum := 0, @id1 := 0, @id2 := 0) r
      ORDER BY t.id1, t.id2, diff, RAND()) x
 WHERE x.rk = 1;
I'm using the query to randomly draw one row within each group defined by an ID1-ID2 pair. Within each group I want the ID3 whose date is closest to YEAR-MMDD (i.e. the absolute difference between YEAR-MMDD and YEAR_ID3-MMDD_ID3 should be minimized). If more than one row has that same minimum difference, the query should select one of them at random.
If this were the table...
ID1  ID2  YEAR  MMDD  ID3  YEAR_ID3  MMDD_ID3
---------------------------------------------
1    2    1991  0821  55   1991      0822
1    2    1991  0821  57   1991      0822
1    2    1991  0821  88   1992      0101
1    3    1990  0131  89   2000      0202
1    3    1990  0131  89   2001      0102
Then the query should return
1,2,1991,0821,55 (or 1,2,1991,0821,57, depending on the random draw)
1,3,1990,0131,89
Here is a SQL dump of a test table:
DROP TABLE IF EXISTS `temp3`;
CREATE TABLE IF NOT EXISTS `temp3` (
  `id1` char(7) NOT NULL,
  `id2` char(7) NOT NULL,
  `YEAR` year(4) NOT NULL,
  `MMDD` char(4) NOT NULL,
  `id3` char(7) NOT NULL,
  `id3_YEAR` year(4) NOT NULL,
  `id3_MMDD` char(4) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
INSERT INTO `temp3` VALUES('1', '2', 1992, '0107', '55', 1991, '0528');
INSERT INTO `temp3` VALUES('1', '2', 1992, '0107', '57', 1991, '0701');
INSERT INTO `temp3` VALUES('1', '3', 1992, '0107', '88', 2000, '0101');
INSERT INTO `temp3` VALUES('1', '3', 1992, '0107', '44', 2000, '0101');
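This is a working solution. Thanks @OMG Ponies for your help. The main change is that the date difference and the ORDER BY now sit in an inner subquery, so the rows are already sorted by the time the outer query assigns the row numbers.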
SELECT
    x.id1,
    x.id2,
    x.YEAR,
    x.MMDD,
    x.id3,
    x.id3_YEAR,
    x.id3_MMDD
FROM
    ( SELECT
          t.*,
          @rownum := CASE
              WHEN @id1 = t.id1 AND @id2 = t.id2 THEN @rownum + 1
              ELSE 1
          END AS rk,
          @id1 := t.id1,
          @id2 := t.id2
      FROM
          ( SELECT
                t.*,
                ABS(DATEDIFF(CONCAT(CAST(t.id3_YEAR AS CHAR(4)),'-', LEFT(t.id3_MMDD,2),'-',RIGHT(t.id3_MMDD,2)),
                             CONCAT(CAST(t.YEAR AS CHAR(4)),'-', LEFT(t.MMDD,2),'-',RIGHT(t.MMDD,2)))) AS diff
            FROM temp3 t
            ORDER BY t.id1, t.id2, diff, RAND()
          ) t,
          ( SELECT @rownum := 0, @id1 := null, @id2 := null ) r
    ) x
WHERE x.rk = 1;
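A note that is not part of the original answer: the MySQL manual describes the order of evaluation for expressions involving user variables as undefined, so the @rownum approach depends on behaviour that is not guaranteed. On MySQL 8.0 or later, a window function may be a sturdier way to express the same rule. The following is only a sketch under that assumption. It reuses the temp3 table and the date expression from the query above, and swaps the @rownum counter for ROW_NUMBER(), which is my substitution rather than anything from the original answer.

```sql
-- Sketch only; assumes MySQL 8.0+ (window functions are not available before 8.0).
SELECT x.id1,
       x.id2,
       x.YEAR,
       x.MMDD,
       x.id3,
       x.id3_YEAR,
       x.id3_MMDD
  FROM (SELECT t.*,
               -- Number the rows inside each (id1, id2) group:
               -- smallest date difference first, ties shuffled by RAND().
               ROW_NUMBER() OVER (
                   PARTITION BY t.id1, t.id2
                   ORDER BY ABS(DATEDIFF(
                                CONCAT(CAST(t.id3_YEAR AS CHAR(4)),'-', LEFT(t.id3_MMDD,2),'-',RIGHT(t.id3_MMDD,2)),
                                CONCAT(CAST(t.YEAR AS CHAR(4)),'-', LEFT(t.MMDD,2),'-',RIGHT(t.MMDD,2)))),
                            RAND()
               ) AS rk
          FROM temp3 t) x
 -- Keep only the first row of each group.
 WHERE x.rk = 1;
```

Because the partitioning and ordering are declared inside the OVER clause, the numbering does not rely on the order in which the rows happen to be read or on when the variables are assigned.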