How do I combine SELECT statements to allow me to calculate percentages, successes and failures in SQL Server?_问答_开发者

Imagine a table :

CUST_PROMO (customer_id,PROMOTION) which is used as a mapping between every promotion that customer have received.

select promotion, count(customer_id) as promo_size
from CUST_PROMO
group by promotion

This gets us the total number of customers in each promotion.

Now, we've got CUSTOMER (customer_id, PROMO_RESPONDED,PROMO_PURCHASED), which lists the customer and which promotion got the customer to respond, and which got them to purchase.

select PROMO_RESPONDED, count(customer_id) as promo_responded 
from CUSTOMER
group by PROMO_RESPONDED

select PROMO_PURCHASED,count(customer_id) as promo_responded 
from CUSTOMER 
group by PROMO_PURCHASED

This is all very self-explanatory; now I've got the number of people for whom each promo was successful.

But; what I'd like to end up with is [in CSV form]

PROMOTION,PROMO_SIZE,PROMO_RESPONDED,PROMO_PURCHASED,PROMO_RESPSUCCESSRATE,blah

1,100,12,5,12%,...
2,200,23,14,11.5%,...

I have no idea how to do this. I can UNION the three queries above; but that doesn't actually result in what I want. I thought about creating an in-memory table, inserting in each promo value and then doing an update statement with a join against it to set the values each -- but that's pretty messy; and requires a new UPDATE statement for each table/select statement. I could also make a temp table per result set and then join them together; but really; who wants to do that?

I can't think of any way of joining this data that makes any sense; since I'm dealing with aggregates.

So, at best, I need a function that, like UNION, will combine result sets, but 开发者_运维百科will actually combine like columns on a key and ADD those columns rather than union which adds rows. The description makes it sound like a JOIN; but I can't see that working.

Thanks for the help!

SELECT
  cp.promotion,
  PROMO_SIZE = COUNT(*),
  PROMO_RESPONDED = COUNT(c1.customer_id),
  PROMO_PURCHASED = COUNT(c2.customer_id),
  PROMO_RESPSUCCESSRATE = COUNT(c1.customer_id) * 100.0 / COUNT(*)
FROM CUST_PROMO cp
  LEFT JOIN CUSTOMER c1
    ON cp.customer_id = c1.customer_id AND cp.promotion = c1.PROMO_RESPONDED
  LEFT JOIN CUSTOMER c2
    ON cp.customer_id = c2.customer_id AND cp.promotion = c2.PROMO_PURCHASED
GROUP BY cp.promotion

WITH tmp AS
(
    SELECT  PROMOTION, 0 as promo_responded, 0 as promo_purchased, COUNT(customer_id) as total
    FROM    CUST_PROMO
    GROUP BY PROMOTION
    SELECT  PROMOTION, COUNT(customer_id) as promo_responded, 0 as promo_purchased, 0 as total
    FROM    CUSTOMER
    GROUP BY PROMO_RESPONDED
    UNION   
    SELECT  PROMOTION, COUNT(customer_id) as promo_purchased, 0 as promo_responded, 0 as total
    FROM    CUSTOMER
    GROUP BY PROMO_PURCHASED
)
SELECT  PROMOTION, SUM(promo_responded) as TotalResponded, SUM(promo_purchased) as TotalPurchased, SUM(Total) as TotalSize,
        SUM(promo_responded)/SUM(Total) as ResponseRate, SUM(promo_purchased)/SUM(Total) as PurchaseRate
FROM    tmp

Does this work? I'm not sure about division and multiplication operators, but I belive my logic is good.The key is using corelated select substatements in the select statement.

SELECT c.promotion, 
       COUNT(c.customer_id) as promo_size, 
       (SELECT COUNT(customer_id)
          FROM CUSTOMER
         WHERE PROMO_RESPONDED = c.promotion) PROMO_RESPONDED,
       (SELECT COUNT(customer_id)
          FROM CUSTOMER
         WHERE PROMO_PURCHASED = c.promotion) PROMO_PURCHASED,
       (SELECT COUNT(customer_id) *100/count(c.customer_id)
         FROM CUSTOMER
        WHERE PROMO_RESPONDED = c.promotion)                                                      
FROM CUST_PROMO c
GROUP BY c.promotion

A cleaner solution using decode. Still not sure the math is working

select PROMOTION, count(CUSTOMER_ID) as promo_size, 
       SUM(DECODE(PROMO_RESPONDED, PROMOTION, 1, 0)) PROMO_RESPONDED,
       SUM(DECODE(PROMO_PURCHASED, PROMOTION, 1, 0)) PROMO PURCHASED,
       SUM(DECODE(PROMO_RESPONDED, PROMOTION, 1, 0))*100/count(CUSTOMER_ID) PROMO_RESPONDED
from CUST_PROMO join CUSTOMER using CUSTOMER_ID
group by PROMOTION

Yes, I think JOINing the three aggregate queries is the way to go. The LEFT JOINs are there just in case some promotion get no response or no purchases.

I also changed the COUNT(customer_id) to COUNT(*). The result is the same, unless customer_id field can have NULL values in the two tables which most probably is not the case. If however, a customer may appear in two rows of a table with same promotion code, then you should change those into COUNT(DISTINCT customer_id) :

SELECT prom.promotion
     , prom.promo_size
     , responded.promo_responded
     , purchased.promo_purchased
     , responded.promo_responded / prom.promo_size
       AS promo_response_success_rate
FROM
    ( SELECT promotion
         , COUNT(*) AS promo_size
      FROM CUST_PROMO
      GROUP BY promotion
    ) AS prom
  LEFT JOIN 
    ( SELECT PROMO_RESPONDED AS promotion
           , COUNT(*) AS promo_responded
      FROM CUSTOMER
      GROUP BY PROMO_RESPONDED
    ) AS responded
    ON responded.promotion = prom.promotion
  LEFT JOIN
    ( SELECT PROMO_PURCHASED AS promotion
           , COUNT(*) AS promo_purchased
      FROM CUSTOMER
      GROUP BY PROMO_PURCHASED
    ) AS purchased
    ON purchased.promotion = prom.promotion