Imagine a table :
CUST_PROMO (customer_id,PROMOTION) which is used as a mapping between every promotion that customer have received.
select promotion, count(customer_id) as promo_size
from CUST_PROMO
group by promotion
This gets us the total number of customers in each promotion.
Now, we've got CUSTOMER (customer_id, PROMO_RESPONDED,PROMO_PURCHASED), which lists the customer and which promotion got the customer to respond, and which got them to purchase.
select PROMO_RESPONDED, count(customer_id) as promo_responded
from CUSTOMER
group by PROMO_RESPONDED
select PROMO_PURCHASED,count(customer_id) as promo_responded
from CUSTOMER
group by PROMO_PURCHASED
This is all very self-explanatory; now I've got the number of people for whom each promo was successful.
But; what I'd like to end up with is [in CSV form]
PROMOTION,PROMO_SIZE,PROMO_RESPONDED,PROMO_PURCHASED,PROMO_RESPSUCCESSRATE,blah
1,100,12,5,12%,...
2,200,23,14,11.5%,...
I have no idea how to do this. I can UNION the three queries above; but that doesn't actually result in what I want. I thought about creating an in-memory table, inserting in each promo value and then doing an update statement with a join against it to set the values each -- but that's pretty messy; and requires a new UPDATE statement for each table/select statement. I could also make a temp table per result set and then join them together; but really; who wants to do that?
I can't think of any way of joining this data that makes any sense; since I'm dealing with aggregates.
So, at best, I need a function that, like UNION, will combine result sets, but 开发者_运维百科will actually combine like columns on a key and ADD those columns rather than union which adds rows. The description makes it sound like a JOIN; but I can't see that working.
Thanks for the help!
SELECT
cp.promotion,
PROMO_SIZE = COUNT(*),
PROMO_RESPONDED = COUNT(c1.customer_id),
PROMO_PURCHASED = COUNT(c2.customer_id),
PROMO_RESPSUCCESSRATE = COUNT(c1.customer_id) * 100.0 / COUNT(*)
FROM CUST_PROMO cp
LEFT JOIN CUSTOMER c1
ON cp.customer_id = c1.customer_id AND cp.promotion = c1.PROMO_RESPONDED
LEFT JOIN CUSTOMER c2
ON cp.customer_id = c2.customer_id AND cp.promotion = c2.PROMO_PURCHASED
GROUP BY cp.promotion
WITH tmp AS
(
SELECT PROMOTION, 0 as promo_responded, 0 as promo_purchased, COUNT(customer_id) as total
FROM CUST_PROMO
GROUP BY PROMOTION
SELECT PROMOTION, COUNT(customer_id) as promo_responded, 0 as promo_purchased, 0 as total
FROM CUSTOMER
GROUP BY PROMO_RESPONDED
UNION
SELECT PROMOTION, COUNT(customer_id) as promo_purchased, 0 as promo_responded, 0 as total
FROM CUSTOMER
GROUP BY PROMO_PURCHASED
)
SELECT PROMOTION, SUM(promo_responded) as TotalResponded, SUM(promo_purchased) as TotalPurchased, SUM(Total) as TotalSize,
SUM(promo_responded)/SUM(Total) as ResponseRate, SUM(promo_purchased)/SUM(Total) as PurchaseRate
FROM tmp
Does this work? I'm not sure about division and multiplication operators, but I belive my logic is good.The key is using corelated select substatements in the select statement.
SELECT c.promotion,
COUNT(c.customer_id) as promo_size,
(SELECT COUNT(customer_id)
FROM CUSTOMER
WHERE PROMO_RESPONDED = c.promotion) PROMO_RESPONDED,
(SELECT COUNT(customer_id)
FROM CUSTOMER
WHERE PROMO_PURCHASED = c.promotion) PROMO_PURCHASED,
(SELECT COUNT(customer_id) *100/count(c.customer_id)
FROM CUSTOMER
WHERE PROMO_RESPONDED = c.promotion)
FROM CUST_PROMO c
GROUP BY c.promotion
A cleaner solution using decode. Still not sure the math is working
select PROMOTION, count(CUSTOMER_ID) as promo_size,
SUM(DECODE(PROMO_RESPONDED, PROMOTION, 1, 0)) PROMO_RESPONDED,
SUM(DECODE(PROMO_PURCHASED, PROMOTION, 1, 0)) PROMO PURCHASED,
SUM(DECODE(PROMO_RESPONDED, PROMOTION, 1, 0))*100/count(CUSTOMER_ID) PROMO_RESPONDED
from CUST_PROMO join CUSTOMER using CUSTOMER_ID
group by PROMOTION
Yes, I think JOIN
ing the three aggregate queries is the way to go. The LEFT JOIN
s are there just in case some promotion get no response or no purchases.
I also changed the COUNT(customer_id)
to COUNT(*)
. The result is the same, unless customer_id
field can have NULL
values in the two tables which most probably is not the case. If however, a customer may appear in two rows of a table with same promotion code, then you should change those into COUNT(DISTINCT customer_id)
:
SELECT prom.promotion
, prom.promo_size
, responded.promo_responded
, purchased.promo_purchased
, responded.promo_responded / prom.promo_size
AS promo_response_success_rate
FROM
( SELECT promotion
, COUNT(*) AS promo_size
FROM CUST_PROMO
GROUP BY promotion
) AS prom
LEFT JOIN
( SELECT PROMO_RESPONDED AS promotion
, COUNT(*) AS promo_responded
FROM CUSTOMER
GROUP BY PROMO_RESPONDED
) AS responded
ON responded.promotion = prom.promotion
LEFT JOIN
( SELECT PROMO_PURCHASED AS promotion
, COUNT(*) AS promo_purchased
FROM CUSTOMER
GROUP BY PROMO_PURCHASED
) AS purchased
ON purchased.promotion = prom.promotion
精彩评论