开发者

How to do fast counting on large tables?

开发者 https://www.devze.com 2023-01-18 17:15 出处:网络
I have large MySQL tables with hundreds of thousands of rows. I need to write a query on a customers table which gets the count of when customers will be available to contact again.

I have large MySQL tables with hundreds of thousands of rows.

I need to write a query on a customers table which gets the count of when customers will be available to contact again.

eg.

SELECT 'This week', COUNT(*) FROM customers 
WHERE sales_person_id = 1 AND DATEDIFF(NOW(), available_date) < 7

UNION

SELECT 'Next week', COUNT(*) FROM customers 
WHERE sales_person_id = 1 AND DATEDIFF(NOW(), available_date) >= 7 
    AND DATEDIFF(NOW(), available_date) < 14

UNION

... (a few more like this)

Having written a similar query on a different large table, I did notice that changing the engine from InnoDB to MyISAM sped up the qu开发者_JAVA技巧ery considerably (InnoDB is not needed for these tables as they do not have foreign key checks). Is there anything else I can do to speed up counts like this (other than indexing the appropriate fields)?


WHERE sales_person_id = 1 AND available_date BETWEEN CURDATE() - INTERVAL 1 WEEK AND CURDATE()

Doing it this way should let MySQL use a composite index created on (sales_person_id, available_date) columns (use EXPLAIN to check)


  1. Never do in multiple queries, what you can do in one.

    If you create a derived table/inline view that has the necessary start & end dates, this can be used to produce the desired result in a single query using a GROUP BY. MySQL doesn't have a recursive function, so you have to use the NUMBERS table trick to generate dates...

    1. Create a table that only holds incrementing numbers - easy to do using an auto_increment:

      DROP TABLE IF EXISTS `example`.`numbers`;
      CREATE TABLE  `example`.`numbers` (
       `id` int(10) unsigned NOT NULL auto_increment,
        PRIMARY KEY  (`id`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
      
    2. Populate the table using:

      INSERT INTO NUMBERS (id)
      VALUES (NULL)
      

      ...for as many values as you need.

    3. Use DATE_ADD to construct a list of dates, increasing the days based on the NUMBERS.id value.

      SELECT x.start_dt,
             x.end_dt
        FROM (SELECT DATE_ADD(NOW(), INTERVAL n.id - 1 DAY) AS start_dt,
                     DATE_ADD(NOW(), INTERVAL n.id + 6 DAY) AS end_dt
                FROM `numbers` n
               WHERE DATE_ADD(NOW(), INTERVAL (n.id - 1) DAY) <= '2011-01-01') x
      
    4. JOIN onto your table of data based on the datetime portion:

        SELECT x.start_dt,
               x.end_dt,
               COUNT(*) AS num
          FROM (SELECT DATE_ADD(NOW(), INTERVAL n.id - 1 DAY) AS start_dt,
                       DATE_ADD(NOW(), INTERVAL n.id + 6 DAY) AS end_dt
                  FROM `numbers` n
                 WHERE DATE_ADD(NOW(), INTERVAL (n.id - 1) DAY) <= '2011-01-01') x
          JOIN CUSTOMERS c ON c.available_date BETWEEN x.start_dt
                                                   AND x.end_dt
      GROUP BY x.start_dt, x.end_dt
      
  2. Don't use functions performed upon actual column data - IE: DATEDIFF(NOW(), *available_date*) - because a database can't use an index (if one exists) on the available_date column because the data has been altered away from the index value.


Focus on the WHERE clause.

  • Is there index on fields in the WHERE clause?
  • Can you replace the datediff() function with a constant, it is being evaluated for every row.
0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号