I've got a really big problem, and it stems from a table with 50k+ records.
This table looks something like this (+15 or so more columns that aren't too important):
table_1
da开发者_开发知识库te | name | email | num_x | num_y
I also have another table ON A DIFFERENT DB (same server) that looks something like this (+1 not important column):
table_2
name | comment | status
table_1 is updated daily with new entries (it is a feed table for use on other projects), which means there are a lot of repeat "name" rows. This is intended. table_2 contains comments and status notes about "name"s, but no repeat "name"s.
I need to write a query that will select all "name"s from table_1 where the total of all num_x + num_y > X. So, for example, if this were a few rows...
2010-11-19 | john.smith | john.smith@example.com | 20 | 20
2010-11-19 | joel.schmo | joel.schmo@example.com | 10 | 10
2010-11-18 | john.smith | john.smith@example.com | 20 | 20
2010-11-18 | joel.schmo | joel.schmo@example.com | 10 | 10
.. and I needed to find all "name"s with total num_x + num_y > 50, then I'd return john.smith | john.smith@example.com | 80 . I would also return john.smith's status and comment from the other DB.
I wrote a query that I believe works fine, but it's problematic because it takes forever and a day to run. I also successfully retrieve records from the other db (I don't have that listed below).
SELECT
name,
email,
SUM(num_x + num_y) AS total
FROM
table_1
GROUP BY
name
HAVING
SUM(num_x + num_y) > 100
ORDER BY
total ASC
Is there a better way to go about this?
Thanks everyone!
Dylan
Why do you repeat the sum in GHAVING rather than repeat total? Unless im missing something, there is no difference in results and avoiding the second sum would save time
If you can skip the ORDER BY clause and don't mind the slightly different select, I think you'll get some amount of speed up by splitting up the sum. I have a small database and have tested that its a valid query and results are correct, but its not nearly large enough to quantify the performance difference.
SELECT
name,
email,
SUM(num_x) as sumX, SUM(num_y) AS sumY
FROM
table_1
GROUP BY
name
HAVING
sumX + sumY > 100
An index on name is a no-brainer. That's the simplest thing that will speed it up.
Create an index for name, this will improve the performance:
ALTER TABLE `table_1` ADD INDEX (`name`);
But, redesigning your databases would be my recomendation. Create an artificial key for names, something like id_name | name | email, beeing id_name an integer auto_increment, this way you'll have a better performance.
Try:
SELECT
name,
email,
num_x + num_y AS total
FROM
table_1
WHERE
num_x + num_y > 100
ORDER BY
total ASC
Just getting rid of the grouping should make quite a significant difference.
maybe change the database the sum is made everytime you change x or y but it really depends of how often you change them... Otherwise you can try to do the sum only once... but I don't see why you do a order by on only one table if you've got a primary key...
精彩评论