I have the following query:
SELECT
`date`,
al.cost,
SUM(l.bnd) AS bnd,
l.type,
count(*) AS amount
FROM alogs AS al
INNER JOIN logs AS l ON al.logid = l.id
WHERE
aid = 0 AND
l.`date` >= '2010-01-17' AND
l.`date` <= '2011-04-19'
GROUP BY l.`date`, l.type
logs has about 5 million rows, alogs has about 4.3 million rows, and the execution time is about 90 seconds.

I have (see the DDL sketch after these lists):
- a primary key on logs.id (auto increment)
- an index (BTREE) on logs.date
- an index (BTREE) on alogs.logid
- an index (BTREE) on alogs.aid

I tried:
- an index (BTREE) on logs.type, but that didn't improve anything (I think because type can only be one of 2 values)
- a covering index on logs.date and logs.type
- partitioning the logs table per month internally, but with the timespan used above (which covers all partitions) it even became slower; I can't partition on aid because there are more than 2k different ids
- stripping functions from the query until it became fast, to see where the problem lies. I only had to remove the GROUP BY clause (and the SUM() and count(*) functions, to get proper results) and the execution time went subsecond.
- removing the GROUP BY clause and grouping in memory, but the result set of over 3 million rows was just too much and would even take longer.

Are there any other things I can do that I don't know of? If so, I'd love to hear about them!
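For reference, this is roughly what the existing indexes and the attempted ones look like as DDL (the index names are mine, just for illustration):

-- existing indexes (names are illustrative)
ALTER TABLE logs ADD PRIMARY KEY (id);
CREATE INDEX idx_logs_date ON logs (`date`);
CREATE INDEX idx_alogs_logid ON alogs (logid);
CREATE INDEX idx_alogs_aid ON alogs (aid);
-- attempts that didn't help
CREATE INDEX idx_logs_type ON logs (type);
CREATE INDEX idx_logs_date_type ON logs (`date`, type);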
Thanks,
lordstyx
EDIT (2011-04-22 11:30): Here's the EXPLAIN EXTENDED result:
id | select_type | table | type   | possible_keys | key     | key_len | ref      | rows    | Extra
1  | SIMPLE      | al    | ref    | logid,aid     | aid     | 4       | const    | 3010624 | Using temporary; Using filesort
1  | SIMPLE      | l     | eq_ref | PRIMARY,date  | PRIMARY | 4       | al.logid | 1       | Using where
If your date range covers only a small subset of the rows in the logs table, you want MySQL to use that index. You said that you created an index on logs.date, but you need a composite index on (logs.date, logs.id) so MySQL doesn't have to read the row off disk to get the id to join to the alogs table. You'd also want an index on alogs(logid) for the join.
You could probably squeeze out some more by putting the columns from the SELECT into the indexes as well, so:
logs(date, id, bnd, type)
alogs(logid, aid, cost)
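In DDL terms that would be roughly the following (index names are illustrative):

CREATE INDEX idx_logs_date_id_bnd_type ON logs (`date`, id, bnd, type);
CREATE INDEX idx_alogs_logid_aid_cost ON alogs (logid, aid, cost);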
If the aid filter cuts out a large portion of rows (a quick way to check this is sketched right after the query below), something like this would pare down the data that's being joined against:
SELECT
l.`date`,
SUM(al.cost) AS cost,
SUM(l.bnd) AS bnd,
l.type,
SUM(qty) AS amount
FROM logs AS l
INNER JOIN
(
SELECT logid, SUM(cost) AS cost, COUNT(*) AS qty
FROM alogs
WHERE aid = 0
GROUP BY logid
) al ON al.logid = l.id
WHERE
l.`date` >= '2010-01-17' AND
l.`date` <= '2011-04-19' -- date filter carried over from the original query
GROUP BY l.`date`, l.type
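As mentioned above, this only pays off if aid = 0 matches a small fraction of the ~4.3 million alogs rows; a quick, purely illustrative check:

SELECT COUNT(*) FROM alogs WHERE aid = 0;  -- compare against the total row count of alogs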
Without knowing a little more about the data structure (many alogs records per logs record?) it's difficult to suggest further improvements. Computing data prior to joins and GROUP BY clauses can greatly speed up execution by reducing the total number of rows that need to be processed. Since the query without grouping returns blazingly fast, further indexing and tweaking is unlikely to increase execution speed.
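One way to answer the "how many alogs records per logs record" question above (purely a sketch, assuming logid is the only link between alogs and logs, as in the query):

SELECT COUNT(*) / COUNT(DISTINCT logid) AS avg_alogs_per_log
FROM alogs;  -- average number of alogs rows per referenced logs row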