Hallo guys,
I'm running a very large database (ATM like >5 Million datasets). My database stores custom generated numbers (which and how they compose doesn't really matters here) and the corresponding date to this one. In addition there is a ID stored for every product (means one product can have multiple entries for different dates in my database -> primary key is divided). Now I want to SELECT
those top 10 ID's which got the largest difference in theire numbers in the last two days. Currently I tried to achieve this using J开发者_如何转开发OINS but since I got that much datasets this way is far to slow. How could I speed up the whole operation?
SELECT
d1.place,d2.place,d1.ID
FROM
daily
INNER JOIN
daily AS d1 ON d1.date = CURDATE()
INNER JOIN
daily as d2 ON d2.date = DATE_ADD(CURDATE(), INTERVAL -1 DAY)
ORDER BY
d2.code-d1.code LIMIT 10
EDIT: Thats how my structure looks like
CREATE TABLE IF NOT EXISTS `daily` (
`ID` bigint(40) NOT NULL,
`source` char(20) NOT NULL,
`date` date NOT NULL,
`code` int(11) NOT NULL,
`cc` char(2) NOT NULL,
PRIMARY KEY (`ID`,`source`,`date`,`cc`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
Thats the output of the Explain
Statement
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE d1 ALL PRIMARY NULL NULL NULL 5150350 Using where; Using temporary; Using filesort
1 SIMPLE d2 ref PRIMARY PRIMARY 8 mytable.d1.ID 52 Using where
How about this?
SELECT
d1.ID, d1.place, d2.place
FROM
daily AS d1
CROSS JOIN
daily AS d2
USING (ID)
WHERE
d1.date = CURDATE()
AND d2.date = CURDATE() - INTERVAL 1 DAY
ORDER BY
d2.code - d1.code DESC
LIMIT
10
Some thoughts about your table structure.
`ID` bigint(40) NOT NULL,
Why BIGINT? You would need to be doing 136 inserts/s 24h a day, 7 days a week for a year to exhaust the range of INT. And before you get halfway there, your application will probably need a professional DBA anyway. Remember, Smaller primary index leads to fater lookups - which brings us to:
PRIMARY KEY (`ID`,`source`,`date`,`cc`)
Why? A single column PK on ID
column should be enough. If you need indexes on other columns, create additional indexes (and to it wisely). As it is, you basically have a covering index for entire table... which is like having entire table in the index.
Last but not least: where is place
column? You've used it in your query (and then I in mine), but it's nowhere to be seen?
Proposed table structure:
CREATE TABLE IF NOT EXISTS `daily` (
`ID` int(10) UNSIGNED NOT NULL, --usually AUTO_INCREMENT is used as well,
`source` char(20) NOT NULL,
`date` date NOT NULL,
`code` int(11) NOT NULL,
`cc` char(2) NOT NULL,
PRIMARY KEY (`ID`),
KEY `ID_date` (`ID`,`date`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
精彩评论