开发者

SQL Optimization

开发者 https://www.devze.com 2023-01-22 14:35 出处:网络
How to optimize this? SELECT e.attr_id, e.sku, a.value FROM product_attr AS e, product_attr_text AS a WHERE e.attr_id = a.attr_id

How to optimize this?

SELECT e.attr_id, e.sku, a.value

FROM product_attr AS e, product_attr_text AS a

WHERE e.attr_id = a.attr_id
AND value
IN (
    SELECT value
    FROM product_attr_text 
    WHERE attribute_id = (
        SELECT attribute_id
        FROM eav_attr 
        WHERE attribute_code = 'similar_prod_id' 
    ) 
    AND value != ''

    GROUP BY value
    HAVING (COUN开发者_开发知识库T( value ) > 1 )
)


SELECT  e.attr_id, e.sku, a.value
FROM    (
        SELECT  pat.value
        FROM    eav_attr ea
        JOIN    product_attr_text pat
        ON      pat.attribute_id = ea.attribute_id
        WHERE   ea.attribute_code = 'similar_prod_id'
                AND value <> ''
        GROUP BY
                value
        HAVING  COUNT(*) > 1
        ) q
JOIN    product_attr_text AS a
ON      a.value = q.value
JOIN    product_attr AS e
ON      e.attr_id  = a.attr_id

Create indexes:

 eav_attr (attribute_code)
 product_attr_text (attribute_id, value)
 product_attr_text (value)
 product_attr (attr_id)


Change it into a JOIN. MySQL doesn't optimize well subqueries in an IN() clause - it is recalculated per row (very inefficient for many rows)

SELECT e.attr_id, e.sku, a.value
FROM product_attr AS e
INNER JOIN product_attr_text AS a ON e.attr_id = a.attr_id
INNER JOIN (SELECT value
    FROM product_attr_text 
    INNER JOIN eav_attr ON eav_attr.attribute_id=product_attr_text.attribute_id
    WHERE attribute_code = 'similar_prod_id'  
    AND value != ''
    GROUP BY value
    HAVING COUNT( value ) > 1 
) AS filter ON filter.value=a.value

After you've converted the query (you may need to make some corrections according to your schema), run EXPLAIN on the query and index accordingly.


Difficult to answer ...

The only thing I can say is:

  • look at the execution plan

  • use IO statistics-

and take a look what part of the statement is problematic, and why. Find out if you can improve performance by adding an index, etc...

Next to that, I don't understand why you're using a group by clause ...

And, why do you use subqueries, instead of joining tables ? (Although I don't think that it will make much difference, since the SQL Server optimizer should be smart enough to generate the same execution plan).


Change the second inner select into a join:

SELECT e.attr_id, e.sku, a.value
FROM product_attr AS e, product_attr_text AS a
WHERE e.attr_id = a.attr_id
AND value
IN (
    SELECT at.value
    FROM product_attr_text at, eav_attr eat 
    WHERE at.attribute_id = eat.attribute_id 
          AND eat.attribute_code = 'similar_prod_id' 
          AND value != ''
    GROUP BY value
    HAVING (COUNT( value ) > 1 )
)

Then, take a look to the execution plan to see how the query is solved. You might need additional optimizations based on that execution plan (e.g. make sure the attribute_id field is indexed).


frederik answer is the right one, but i'd like to suggest a few point :

  • avoid IN, use EXISTS instead
  • avoid = (SELECT something from sometable), use exists instead
  • useless group by
  • use inner join instead of multiple table in from clause

but again, it depends on your machine/dbms/version etc... so to get the best performance you'll have to compare the different execution plan


Caution: Be wary of designs that use EAV (entity, attribute, value) table designs. They tend to become very difficult to handle, for reasons of referential integrity and query complexity.

Optimize by using:

  • Joins instead of sub-selects.
  • Join notation.
  • Explicit table aliases on all column names.

Phase 1:

SELECT e.attr_id, e.sku, a.value
  FROM product_attr AS e JOIN product_attr_text AS a
       ON e.attr_id = a.attr_id
 WHERE a.value IN (
          SELECT p.value
            FROM product_attr_text AS p
            JOIN eav_attr AS v ON p.attribute_id = v.attribute_id
           WHERE v.attribute_code = 'similar_prod_id'
             AND p.value != ''
           GROUP BY value
          HAVING (COUNT( value ) > 1)
          )

Phase 2:

SELECT e.attr_id, e.sku, a.value
  FROM product_attr AS e
  JOIN product_attr_text AS a ON e.attr_id = a.attr_id
  JOIN (SELECT p.value
          FROM product_attr_text AS p
          JOIN eav_attr AS v ON p.attribute_id = v.attribute_id
         WHERE v.attribute_code = 'similar_prod_id'
           AND p.value != ''
         GROUP BY value
        HAVING (COUNT( value ) > 1)
       ) AS x ON x.value = a.value


use a derived table for the group by having... and join that back on whatever table(s) you need to pad out the result

0

精彩评论

暂无评论...
验证码 换一张
取 消