i have a transaction table like this
cust_id | item --------------------- 1 | book 1 | shoe 2 | book 3 | shoe
how can i use SQL com开发者_如何学编程mand to find customer who bought book and shoe together? so, the result can be only customer 1. I need to do Apriori project, iv try to google, but i dunno the correct phrase to be googled, please help me, thanks a lot.
You can use a self-join:
select a.cust_id
from transaction a, transactions b
where a.cust_id = b.cust_id
and a.item = 'book'
and b.item = 'shoe'
The use of two references to the transactions
table creates a cartesian product (every row combined with every row) of all book
and shoe
buyers which will, when limited by the a.cust_id = b.cust_id
clause, exclude all those that only bought one of them, as below, where only customer 1 satisfies all the requirements (X 'book' X 'shoe'
, where X
is any customer ID, but the same in both columns 1 and 3 of the result set):
a.cust_id | a.item | b.cust_id | b.item | include?
----------+--------+-----------+--------+---------
1 | book | 1 | book |
1 | shoe | 1 | book |
2 | book | 1 | book |
3 | shoe | 1 | book |
1 | book | 1 | shoe | yes
1 | shoe | 1 | shoe |
2 | book | 1 | shoe |
3 | shoe | 1 | shoe |
1 | book | 2 | book |
1 | shoe | 2 | book |
2 | book | 2 | book |
3 | shoe | 2 | book |
1 | book | 3 | shoe |
1 | shoe | 3 | shoe |
2 | book | 3 | shoe |
3 | shoe | 3 | shoe |
There's probably a way to do this with an explicit join
statement as well but, since I use a DBMS that's smart enough to figure it out, I don't care. I won't state which DBMS since I don't want to start a religious war :-)
SELECT cust_id FROM table WHERE item='book'
INTERSECT
SELECT cust_id FROM table WHERE item='shoe'
http://ideone.com/V74E4
精彩评论