开发者

GROUP BY and ORDER BY too slow. How to make faster?

开发者 https://www.devze.com 2023-03-22 18:20 出处:网络
I\'ve trying to create some stats for my table but it has over 3 million rows so it is really slow. I\'m trying to find the most popular value for column name and also showing how many times it pops

I've trying to create some stats for my table but it has over 3 million rows so it is really slow.

I'm trying to find the most popular value for column name and also showing how many times it pops up.

I'm using this at the momment but it doesn't work cause its too slow and I just get errors.

    $total = mysql_query("SELECT `name`, COUNT(*) as b FROM `people` GROUP BY `name` ORDER BY `b` DESC LIMIT 0,5;")or die(mysql_error());

As you may see I'm trying to get all the names and how many times that name has been used but only show the top 5 to hopefully speed it up.

I would like to be able to then do get the values like

   while($row = mysql_fetch_array($result)){
        echo $row['name'].': '.$row['b']."\r\n";
   }

And it will sho开发者_开发知识库w things like this;

  Bob: 215
  Steve: 120
  Sophie: 118
  RandomGuy: 50
  RandomGirl: 50

I don't care much about ordering the names afterwards like RandomGirl and RandomGuy been the wrong way round.

I think I've have provided enough information. :) I would like the names to be case-insensitive if possible though. Bob should be the same as BoB, bOb, BOB and so on.

Thank-you for your time Paul


Limiting results on the top 5 won't give you a lot of speed-up, you'll gain time in the result retrieval, but in mySQL side the whole table still needs to be parsed (to count).

You will speed-up your count query having index on name column, of course as only the index will be parsed and not the table.

Now if you really want to speed up the result and avoid parsing the name index when you need this result (which will still be quite slow if you really have millions of rows), then the only other solution is computing the stats when inserting, deleting or updating rows on this table. That is using triggers on this table to maintain a statistics table near this one. Then you will really only have a simple select query on this statistics table, with only 5 rows parsed. But you will slow down your inserts, delete and update operations (which are already quite slow, especially if you maintain indexes, so if the stats are important you should study this solution).


Do you have an index on name? It might help.


Since you are doing the counting/grouping and then sorting an index on name doesn't help at all MySql should go through all rows every time, there is no way to optimize this. You need to have a separate stats table like this:

CREATE TABLE name_stats( name VARCHAR(n), cnt INT, UNIQUE( name ), INDEX( cnt ) )

and you should update this table whenever you add a new row to 'people' table like this:

INSERT INTO name_stats VALUES( 'Bob', 1 ) ON DUPLICATE KEY UPDATE cnt = cnt + 1;

Querying this table for the list of top names should give you the results instantaneously.

0

精彩评论

暂无评论...
验证码 换一张
取 消