Optimizing COUNT(*) query with 4 MM relations and larger table_问答_开发者

Optimizing COUNT(*) query with 4 MM relations and larger table

开发者 https://www.devze.com 2022-12-08 19:38 出处：网络

I am struggling (again) with the table from this question: how to optimize this query (4 mm tables involved)

相关专题：optimization

I am struggling (again) with the table from this question: how to optimize this query (4 mm tables involved)

It is one main table product_table which has four MM relations via the lookup tables mm1 to mm4. The lookup tables have the fields uid_local containing the uid of the product_table and uid_foreign containing the uid of the categories by which the result gets filtered.

After Quassnois suggestion from the above mentioned question which improved the performance the query looks like this:

SELECT
COUNT(*)
FROM
product_table
WHERE
(product_table.partner_id = 0 OR product_table.partner_id = 15) AND
EXISTS(SELECT NULL FROM mm1 WHERE mm1.uid_local = product_table.uid AND mm1.uid_foreign IN (10, 11, 12, 13, 14, 53)) AND
EXISTS(SELECT NULL FROM mm2 WHERE mm2.uid_local = product_table.uid AND mm2.uid_foreign IN (7, 8, 9)) AND
EXISTS(SELECT NULL FROM mm3 WHERE mm3.uid_local = product_table.uid AND mm3.uid_foreign IN (51 ,52)) AND
EXISTS(SELECT NULL FROM mm4 WHERE mm4.uid_local = product_table.uid AND mm4.uid_foreign IN (15, 16, 17, 18, 19))

This returns somewhat around 10.000 rows as COUNT(*) but it takes > 0.5 seconds for the query to execute. This is too slow.

EXPLAIN looks like this:

id  select_type         table           type  possible_keys           key         key_len   ref       rows    Extra
1   PRIMARY             product_table   re开发者_StackOverflow社区f   NewIndex                NewIndex    4         const     9430    Using where
5   DEPENDENT SUBQUERY  mm1             ref   uid_local,uid_foreign   uid_local   4         mm1.uid   5       Using where
4   DEPENDENT SUBQUERY  mm2             ref   uid_local,uid_foreign   uid_local   4         mm2.uid   2       Using where
3   DEPENDENT SUBQUERY  mm3             ref   uid_local,uid_foreign   uid_local   4         mm3.uid   3       Using where
2   DEPENDENT SUBQUERY  mm4             ref   uid_local,uid_foreign   uid_local   4         mm4.uid   6       Using where

If I change the product_table.partner_id = 0 to something that returns less rows e. g. a few hundred or so the query is very fast (0.015 sec).

So the problem is that the COUNT(*) query is very fast if it needs to count only a few rows (100 - 200) but very slow on counting a larger result set (> 10.000 rows).

Any ideas on how to fix this?

A few facts:

the mm tables have indexes on uid_local and uid_foreign
The product_table has a indexed PK and there is another index on partner_id
The product_table table contains around 30.000 rows

EDIT

The IN clauses are not fixed, the query could instead of IN (7, 8, 9) look like IN (7, 9) or other combination of numbers.

You could use materialized views for the EXISTS parts of your query. This way, you could fold the IN statements into a single query. Then you just need to create an index on uid and the query should take almost no time anymore:

SELECT
COUNT(*)
FROM
product_table
WHERE
(product_table.partner_id = 0 OR product_table.partner_id = 15) AND
EXISTS(SELECT NULL FROM vmm1 WHERE vmm1.uid_local = product_table.uid) AND
EXISTS(SELECT NULL FROM vmm2 WHERE vmm2.uid_local = product_table.uid) AND
EXISTS(SELECT NULL FROM vmm3 WHERE vmm3.uid_local = product_table.uid) AND
EXISTS(SELECT NULL FROM vmm4 WHERE vmm4.uid_local = product_table.uid)

The drawback: If you change the mm-tables a lot, the view will need to be updated and that will make the changes to those tables slower.

take a look at FORCE INDEX in the mysql manual. if you have 10k rows in your M2M table that match, the optimizer might have decided that scanning the table is better than consulting the index, but in your case, probably not.