I have the following which when run on its own is very quick, but when I am performing this for many entity_id
the queries start to take longer and longer (the loop is a PHP foreach) for example this query only takes 0.078 but the same query on a different entity within the loop on takes upto 2.1 seconds, the queries seem to get slower and slower the more entities I put in the loop. Why is this? and how can I improve/optimize the query?
foreach($entity_ids as $entity_id) {
SELECT COUNT(*) as prev, DATE_FORMAT(`created`, '%Y%m%d') AS date_group
FROM articles_entities
WHERE entity_id = '$entity_id'
AND `created` >= DATE_SUB(CURDATE(), INTERVAL 10 DAY)
GROUP BY date_group
// store result
}
I have the following table structure:
CREATE TABLE `articles_entities` (
`id` CHAR(36) NOT NULL,
`article_id` CHAR(36) NOT NULL,
`entity_id` CHAR(36) NOT NULL,
`created` DATETIME DEFAULT NULL,
`modified` DATETIME DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `article_id` (`article_id`),
KEY `entity_id` (`entity_id`),
KEY `created` (`created`)
) ENGI开发者_JAVA技巧NE=MYISAM DEFAULT CHARSET=utf8;
It looks to me like you have an array of ID's, then want to pull records from your table (in a way controlled by your statement) where the ID field matches one of the array values.
Instead of using a LOOP in PHP to run multiple SQL statements, the best thing to do is build one master statement then use PHP to handle the results. You can accomplish this using the SQL IN statement:
// where $entity_ids is an array eg 1,2,3,4,5
$sql="SELECT entity_id AS 'alt_entity_id', COUNT(entity_id) as prev, DATE_FORMAT(`created`, '%Y%m%d') AS 'date_group'
FROM articles_entities
WHERE entity_id IN ".implode(",",$entity_ids)."
AND `created` >= DATE_SUB(CURDATE(), INTERVAL 10 DAY)
GROUP BY date_group, entity_id";
// you may wish to revese the group fields, as you require, you may also wish to change the count field to date_group, depending on what you wish to be counted
This will run the original query one time, for all the id values you have, grouped by both date AND the id value passed. You can then use PHP to filter out the results for the specific id from the returned resultset.
This is far more efficient than the overhead produced by looping the execution of a query.
Your returned resultset will look something like:
entity_id | count(entity_id) | date_group
----------|------------------|------------
1 | 3 | 2010-04-01
1 | 3 | 2010-03-01
1 | 3 | 2010-02-01
2 | 2 | 2010-01-01
2 | 2 | 2010-02-01
3 | 1 | 2010-06-01
4 | 2 | 2010-06-01
4 | 2 | 2010-02-01
I don't know where you're getting the entity values for your loop, but executing this query within a loop will always be a big performance overhead. If you're getting the entity_ids from a previous SQL query, it might make more sense to refactor your SQL to join the initial query with the loop query so that you're returning all the data you need in a single SQL query.
Roughly how many entities are you dealing with?
Could you insert the required entites into a seperate table and do a join, rather than have multiple queries?
Take all ids into an array, join it to form a string and use "where in" to get the details in optimized way
$enitityIDS = array();
foreach($entity_ids as $entity_id) {
$enitityIDS[]=$entity_id;
}
$entityIDString = join(",",$enitityIDS);
then do
SELECT COUNT(*) as prev, DATE_FORMAT(`created`, '%Y%m%d') AS date_group
FROM articles_entities
WHERE entity_id in (".$entityIDString.")
AND `created` >= DATE_SUB(CURDATE(), INTERVAL 10 DAY)
GROUP BY date_group, entity_id
Optimal Way
精彩评论