How do I optimize this MySQL query?_问答_开发者

开发者 https://www.devze.com 2023-01-26 14:53 出处：网络

I\'ve got a really big problem, and it stems from a table with 50k+ records. This table looks something like this (+15 or so more columns that aren\'t too important):

相关专题：

I've got a really big problem, and it stems from a table with 50k+ records.

This table looks something like this (+15 or so more columns that aren't too important):

table_1
da开发者_开发知识库te | name | email | num_x | num_y

I also have another table ON A DIFFERENT DB (same server) that looks something like this (+1 not important column):

table_2
name | comment | status

table_1 is updated daily with new entries (it is a feed table for use on other projects), which means there are a lot of repeat "name" rows. This is intended. table_2 contains comments and status notes about "name"s, but no repeat "name"s.

I need to write a query that will select all "name"s from table_1 where the total of all num_x + num_y > X. So, for example, if this were a few rows...

2010-11-19 | john.smith | john.smith@example.com | 20 | 20  
2010-11-19 | joel.schmo | joel.schmo@example.com | 10 | 10  
2010-11-18 | john.smith | john.smith@example.com | 20 | 20  
2010-11-18 | joel.schmo | joel.schmo@example.com | 10 | 10

.. and I needed to find all "name"s with total num_x + num_y > 50, then I'd return john.smith | john.smith@example.com | 80 . I would also return john.smith's status and comment from the other DB.

I wrote a query that I believe works fine, but it's problematic because it takes forever and a day to run. I also successfully retrieve records from the other db (I don't have that listed below).

SELECT        
    name,                                
    email,
    SUM(num_x + num_y) AS total
FROM
    table_1    
GROUP BY
   name
 HAVING
    SUM(num_x + num_y) > 100
ORDER BY
     total ASC

Is there a better way to go about this?

Thanks everyone!

Dylan

Why do you repeat the sum in GHAVING rather than repeat total? Unless im missing something, there is no difference in results and avoiding the second sum would save time

If you can skip the ORDER BY clause and don't mind the slightly different select, I think you'll get some amount of speed up by splitting up the sum. I have a small database and have tested that its a valid query and results are correct, but its not nearly large enough to quantify the performance difference.

SELECT        
   name,                                
   email,
   SUM(num_x) as sumX, SUM(num_y) AS sumY
FROM
   table_1    
GROUP BY
   name
HAVING
   sumX + sumY > 100

An index on name is a no-brainer. That's the simplest thing that will speed it up.

Create an index for name, this will improve the performance:

ALTER TABLE `table_1` ADD INDEX (`name`);

But, redesigning your databases would be my recomendation. Create an artificial key for names, something like id_name | name | email, beeing id_name an integer auto_increment, this way you'll have a better performance.

Try:

SELECT         
    name,                                 
    email, 
    num_x + num_y AS total 
FROM 
    table_1     
WHERE
    num_x + num_y > 100 
ORDER BY 
     total ASC

Just getting rid of the grouping should make quite a significant difference.

maybe change the database the sum is made everytime you change x or y but it really depends of how often you change them... Otherwise you can try to do the sum only once... but I don't see why you do a order by on only one table if you've got a primary key...