I am developing a marketing-type system. On the front page, one of the requirements is for sales staff to see the number of sales opportunities they currently have.
ie.
Birthdays | 10
Anniversaries | 15
Introductions | 450
Recurring | 249
The problem is I am UNION
ing all these and the query is taking over 10s in some cases. (We have caching in place so this is only a problem the first time a user logs in for the day).
There is a lot of other criteria involved:
- included in the count should only be the latest one per customer per type (ie. if a customer has two introductions, it should only be counted once - I am using the greatest-n-per-group method of accomplishing this)
- for Birthdays and Anniversaries, the date should be +/- 7 days from today
- for all of them, only records in the past 60 days should be counted
- these records need to be joined with the customers table to make sure that the opportunity's sales person matches the customer's current sales person
Here is the generated query (It is long):
SELECT 'Birthdays' AS `type`, COUNT(*) AS `num`
FROM `opportunities`
INNER JOIN `customers`
ON `opportunities`.`customer_id` = `customers`.`customer_id`
AND `opportunities`.`sales_person_id` = `customers`.`sales_person_id`
LEFT JOIN `opportunities` AS `o2`
ON `opportunities`.`customer_id` = `o2`.`customer_id`
AND `opportunities`.`marketing_message` = `o2`.`marketing_message`
AND opportunities.communication_alert_date < o2.communication_alert_date
WHERE ((`opportunities`.`org_code` = ?))
AND (opportunities.marketing_message = 'Birthday Alert')
AND ((opportunities.communication_alert_date BETWEEN
DATE_SUB(NOW(), INTERVAL 7 DAY) AND DATE_ADD(NOW(), INTERVAL 7 DAY)))
AND (opportunities.communication_alert_date >= DATE_SUB(NOW(), INTERVAL 60 DAY))
AND (o2.customer_id IS NULL)
UNION ALL
SELECT 'Anniversaries' AS `type`, COUNT(*) AS `num`
FROM `opportunities`
INNER JOIN `customers`
ON `opportunities`.`customer_id` = `customers`.`customer_id`
AND `opportunities`.`sales_person_id` = `customers`.`sales_person_id`
LEFT JOIN `opportunities` AS `o2`
ON `opportunities`.`customer_id` = `o2`.`customer_id`
AND `opportunities`.`marketing_message` = `o2`.`marketing_message`
AND opportunities.communication_alert_date < o2.communication_alert_date
WHERE ((`opportunities`.`org_code` = ?))
AND (opportunities.marketing_message = 'Anniversary Alert')
AND ((opportunities.communication_alert_date BETWEEN
DATE_SUB(NOW(), INTERVAL 7 DAY) AND DATE_ADD(NOW(), INTERVAL 7 DAY)))
AND (opportunities.communication_alert_date >= DATE_SUB(NOW(), INTERVAL 60 DAY))
AND (o2.customer_id IS NULL)
UNION ALL
SELECT 'Introductions' AS `type`, COUNT(*) AS `num`
FROM `opportunities`
INNER JOIN `customers`
ON `opportunities`.`customer_id` = `customers`.`customer_id`
AND `opportunities`.`sales_person_id` = `customers`.`sales_person_id`
LEFT JOIN `opportunities` AS `o2`
ON `opportunities`.`customer_id` = `o2`.`customer_id`
AND `opportunities`.`marketing_message` = `o2`.`marketing_message`
AND opportunities.communication_alert_date < o2.communication_alert_date
WHERE ((`opportunities`.`org_code` = ?))
AND ((opportunities.Intro_Letter = 'Yes'))
AND (opportunities.communication_alert_date >= DATE_SUB(NOW(), INTERVAL 60 DAY))
AND (o2.customer_id IS NULL)
UNION ALL
SELECT 'Recurring' AS `type`, COUNT(*) AS `num`
FROM `opportunities`
INNER JOIN `customers`
ON `opportunities`.`customer_id` = `customers`.`customer_id`
AND `opportunities`.`sales_person_id` = `customers`.`sales_person_id`
LEFT JOIN `opportunities` AS `o2`
ON `opportunities`.`customer_id` = `o2`.`customer_id`
AND `opportunities`.`marketing_message` = `o2`.`marketing_message`
AND opportunities.communication_alert_date < o2.communication_alert_date
WHERE ((`opportunities`.`org_code` = ?))
AND ((opportunities.marketing_message != 'Anniv开发者_StackOverflowersary Alert'
AND opportunities.marketing_message != 'Birthday Alert'
AND opportunities.Intro_Letter != 'Yes'))
AND (opportunities.communication_alert_date >= DATE_SUB(NOW(), INTERVAL 60 DAY))
AND (o2.customer_id IS NULL)
I have indexes on the following fields in the opportunities
table:
- org_code
- customer_id
- Intro_Letter
- marketing_message
- sales_person_id
- org_code, marketing_message
- org_code, Intro_Letter
- org_code, marketing_message, Intro_Letter
Any help optimizing this would be greatly appreciated. I am open to creating other tables or views if need be.
A good place to start would be removing the string comparisons and putting them in a table with assigned IDs and adding numerical columns in the place of
opportunities.marketing_message != 'Birthday Alert'
So you'd have...
[id] [name]
1 Birthday Alert
2 Anniversary
Numerical comparisons are always much quicker even with indexing. Doing this would also allow you to easily add new types in the future.
This part is redundant, you don't need AND (opportunities.communication_alert_date >= DATE_SUB(NOW(), INTERVAL 60 DAY))
because the clause right before it will do the job.
AND ((opportunities.communication_alert_date BETWEEN
DATE_SUB(NOW(), INTERVAL 7 DAY) AND DATE_ADD(NOW(), INTERVAL 7 DAY)))
AND (opportunities.communication_alert_date >= DATE_SUB(NOW(), INTERVAL 60 DAY))
I agree with existing comments that the alert text needs to be in a type table, with a foreign key relationship to the OPPORTUNITIES table.
Leave it to Zend to two queries when you only need one:
SELECT CASE
WHEN marketing_message = 'Birthday Alert' THEN 'Birthdays'
WHEN marketing_message = 'Anniversary Alert' THEN 'Anniversaries'
END AS msg,
COUNT(*)
FROM OPPORTUNITIES o
JOIN CUSTOMERS c ON c.customer_id = o.customer_id
AND c.sales_person_id = o.sales_person_id
LEFT JOIN OPPORTUNITIES o2 ON o2.customer_id = o.customer_id
AND o2.marketing_message = o.marketing_message
AND o2.communication_alert_date < o.communication_alert_date
WHERE o.org_code ?
AND o.marketing_message IN ('Birthday Alert', 'Anniversary Alert')
AND o.communication_alert_date BETWEEN DATE_SUB(NOW(), INTERVAL 7 DAY)
AND DATE_ADD(NOW(), INTERVAL 7 DAY)
AND o.communication_alert_date >= DATE_SUB(NOW(), INTERVAL 60 DAY)
AND o2.customer_id IS NULL
GROUP BY msg
You could make it easier to read by getting rid of all of the grouping parenthesis in the where clause. That would at least make it easier to see whats going on and to optimize
In each subquery you have:
LEFT JOIN `opportunities` AS `o2`
ON `opportunities`.`customer_id` = `o2`.`customer_id`
...
AND (o2.customer_id IS NULL)
That means you want only opportunities o2
that have NULL for customer_id. Because of that these queries can be written with 2 INNER joins instead of 1 OUTER and 1 INNER join what is probably faster.
Something like this:
SELECT `o1`.`Birthdays` AS `type`, COUNT(*) AS `num`
FROM `opportunities` as `o2`
INNER JOIN `opportunities` AS `o1`
ON `o1`.`marketing_message` = `o2`.`marketing_message`
AND o1.communication_alert_date < o2.communication_alert_date
INNER JOIN `customers`
ON `o1`.`customer_id` = `customers`.`customer_id`
AND `o1`.`sales_person_id` = `customers`.`sales_person_id`
WHERE (o2.customer_id IS NULL)
AND (o2.marketing_message = 'Birthday Alert')
AND ((`o1`.`org_code` = ?))
AND ((o1.communication_alert_date BETWEEN
DATE_SUB(NOW(), INTERVAL 7 DAY) AND DATE_ADD(NOW(), INTERVAL 7 DAY)))
AND (o1.communication_alert_date >= DATE_SUB(NOW(), INTERVAL 60 DAY))
In addition to the answers provided, I replaced the LEFT JOIN with a subquery to return only the most recent instances by type. This seemed to help immensely.
ie (for just the Birthday and Anniversary count):
SELECT
CASE
WHEN marketing_message = 'Birthday Alert' THEN 'Birthdays'
WHEN marketing_message = 'Anniversary Alert' THEN 'Anniversaries'
END AS `type`,
COUNT(*) AS `num`
FROM (
SELECT `opp_sub`.*
FROM (
SELECT `opportunities`.`marketing_message`, `opportunities`.`customer_id`
FROM `opportunities`
INNER JOIN `customers`
ON `opportunities`.`customer_id` = `customers`.`customer_id`
AND `opportunities`.`sales_person_id` = `customers`.`sales_person_id`
WHERE (opportunities.communication_alert_date >= DATE_SUB(NOW(), INTERVAL 60 DAY))
AND (`opportunities`.`dealer_code` = ?)
AND (opportunities.marketing_message IN ('Anniversary Alert', 'Birthday Alert'))
AND (opportunities.communication_alert_date
BETWEEN DATE_SUB(NOW(), INTERVAL 7 DAY)
AND DATE_ADD(NOW(), INTERVAL 7 DAY))
ORDER BY `opportunities`.`communication_alert_date` DESC
) AS `wool_sub`
GROUP BY `customer_id`, `marketing_message`
) AS `c_table`
If you look at this http://dev.mysql.com/doc/refman/5.0/en/using-explain.html You will se that checking your query with the EXPLAIN keyword gives you information about how the query executes. Then you can see exactly where the performance is poor.
精彩评论