I'm wondering if indexes will speed this query up. It takes 9 seconds last time I checked. The traffic table has about 300k rows, listings and users 5k rows. I'm open to ridicule/humiliation too, if this is just a crappy query altogether. I wrote it long ago.
It's supposed to get the listings with the most page views (traffic). Let me know if the explanation is lacking.
SELECT traffic_listingid AS listing_id,
COUNT(traffic_listingid) AS genuine_hits,
COUNT(DISTINCT traffic_ipaddress) AS distinct_ips,
users.username,
listings.listing_address,
listings.datetime_created,
DATEDIFF(NOW(), listings.datetime_created) AS listing_age_days
FROM traffic
LEFT JOIN listings
ON traffic.traffic_listingid = listings.listing_id
LEFT JOIN开发者_运维知识库 users
ON users.id = listings.seller_id
WHERE traffic_genuine = 1
AND listing_id IS NOT NULL
AND username IS NOT NULL
AND DATEDIFF(NOW(), traffic_timestamp) < 24
GROUP BY traffic_listingid
ORDER BY distinct_ips DESC
LIMIT 10
P.S.
ENGINE=MyISAM /
MySQL Server 4.3
Sidenotes:
1.You have
LEFT JOIN listings
ON traffic.traffic_listingid = listings.listing_id
...
WHERE ...
AND listing_id IS NOT NULL
This condition cancels the LEFT JOIN
. Change your query into:
INNER JOIN listings
ON traffic.traffic_listingid = listings.listing_id
and remove the listing_id IS NOT NULL
from the WHERE
conditions.
The same thing applies to LEFT JOIN user
and username IS NOT NULL
.
2.The check on traffic_timestamp
:
DATEDIFF(NOW(), traffic_timestamp) < 24
makes it difficult for the index to be used. Change it into something like this that can use an index
(and check that my version is equivalent, I may have mistakes):
traffic_timestamp >= CURRENT_DATE() - INTERVAL 23 DAY
3.The COUNT(non-nullable-column)
is equivalent to COUNT(*)
. Change the:
COUNT(traffic_listingid) AS genuine_hits,
to:
COUNT(*) AS genuine_hits,
because it's bit faster in MySQL (although I'm not sure about that for version 4.3)
For the index question, you should have at least an index on every column that is used for joining. Adding one more for the traffic_timestamp
will probably help, too.
If you tell us in which tables the traffic_ipaddress
and traffic_timestamp
are, and what the EXPLAIN EXTENDED
shows, someone may have a better idea.
Reading again the query, it seems that it's actually a GROUP BY
only in table traffic
and the other 2 tables are used to get refrence data. So, the query is equivalent to a (traffic group by)-join-listing-join-user. Not sure if that helps in your MySQL old version but it may be good to have both versions of the query and test if one query runs faster in your system.
Indexes should always be put on columns you use in the where clause.
In this case the listingid looks like a good option, as well as the users.id, seller_id and traffic_timestamp.
Use a EXPLAIN EXTENDED
in front of your query to see what MySQL recommends you (It shows how many rows are touched, and what indexes it uses)
精彩评论