开发者

What SQL indexes should I add for this bloated query?

开发者 https://www.devze.com 2023-04-09 20:11 出处:网络
I\'m wondering if indexes will speed this query up. It takes 9 seconds last time I checked.The traffic table has about 300k rows, listings and users 5k rows.I\'m open to ridicule/humiliation too, if t

I'm wondering if indexes will speed this query up. It takes 9 seconds last time I checked. The traffic table has about 300k rows, listings and users 5k rows. I'm open to ridicule/humiliation too, if this is just a crappy query altogether. I wrote it long ago.

It's supposed to get the listings with the most page views (traffic). Let me know if the explanation is lacking.

SELECT traffic_listingid AS listing_id, 
       COUNT(traffic_listingid) AS genuine_hits, 
       COUNT(DISTINCT traffic_ipaddress) AS distinct_ips, 
       users.username, 
       listings.listing_address,
       listings.datetime_created,
       DATEDIFF(NOW(), listings.datetime_created) AS listing_age_days 
FROM traffic 
  LEFT JOIN listings 
    ON traffic.traffic_listingid = listings.listing_id 
  LEFT JOIN开发者_运维知识库 users 
    ON users.id = listings.seller_id 
WHERE traffic_genuine = 1 
  AND listing_id IS NOT NULL 
  AND username IS NOT NULL 
  AND DATEDIFF(NOW(), traffic_timestamp) < 24 
GROUP BY traffic_listingid 
ORDER BY distinct_ips DESC
LIMIT 10

P.S.

ENGINE=MyISAM /
MySQL Server 4.3


Sidenotes:

1.You have

  LEFT JOIN listings 
    ON traffic.traffic_listingid = listings.listing_id 
  ...
WHERE ... 
  AND listing_id IS NOT NULL 

This condition cancels the LEFT JOIN. Change your query into:

  INNER JOIN listings 
    ON traffic.traffic_listingid = listings.listing_id 

and remove the listing_id IS NOT NULL from the WHERE conditions.

The same thing applies to LEFT JOIN user and username IS NOT NULL.


2.The check on traffic_timestamp:

      DATEDIFF(NOW(), traffic_timestamp) < 24

makes it difficult for the index to be used. Change it into something like this that can use an index
(and check that my version is equivalent, I may have mistakes):

      traffic_timestamp >= CURRENT_DATE() - INTERVAL 23 DAY

3.The COUNT(non-nullable-column) is equivalent to COUNT(*). Change the:

   COUNT(traffic_listingid) AS genuine_hits, 

to:

   COUNT(*) AS genuine_hits, 

because it's bit faster in MySQL (although I'm not sure about that for version 4.3)


For the index question, you should have at least an index on every column that is used for joining. Adding one more for the traffic_timestamp will probably help, too.

If you tell us in which tables the traffic_ipaddress and traffic_timestamp are, and what the EXPLAIN EXTENDED shows, someone may have a better idea.

Reading again the query, it seems that it's actually a GROUP BY only in table traffic and the other 2 tables are used to get refrence data. So, the query is equivalent to a (traffic group by)-join-listing-join-user. Not sure if that helps in your MySQL old version but it may be good to have both versions of the query and test if one query runs faster in your system.


Indexes should always be put on columns you use in the where clause.

In this case the listingid looks like a good option, as well as the users.id, seller_id and traffic_timestamp.

Use a EXPLAIN EXTENDED in front of your query to see what MySQL recommends you (It shows how many rows are touched, and what indexes it uses)

0

精彩评论

暂无评论...
验证码 换一张
取 消