I have a database that contains 2 tables:
Users (user_id primary key), and Friends
The friends table is organized into 2 columns friend1, friend2, both of which contain user_ids as (foreign keys referencing Users). In each friend pair, friend1's user id is less than friend 2's user id.
I am trying to find a list of users who are not friends but who share the greatest amount of friends.
I have managed to do this in two separate queries:
Number of Shared Friends for users u1 and u2
select count(*)
from
((select friend1 from friends where friend2 = u1 UNION
select friend2 from friends where friend1 = u1)
INTERSECT
(select friend1 from friends where friend2 = u2 UNION
select friend2 from friends where friend1 = u2))
;
Set of all user_id -> user_id pairs who are not friends:
select distinct
u1.user_id as friend1,
u2.user_id as friend2
from
users u1,
users u2
where
u1.user_id < u2.user_id
minus
select friend1, friend开发者_如何学Go2
from friends order by friend1;
However my ultimate goal is to get a result that is
user1 user2 shared_friends
such that user1 < user2, and user1 and user2 are not friends, and shared_friends is the count of how many friends these users have in common, which I have thus far been unsuccessful in achieving.
This should get you started, but it is Microsoft SQL (although pretty generic)
select a1.friend1 as User1,
a2.friend1 as user2,
count( distinct a1.friend2) as Shared_Friends
from friends a1
join (select distinct friend1,friend2 from friends a2) a2
on a1.friend2=a2.friend2
left join friends a3 on a3.friend1=a1.friend1 and a3.friend2 = a2.friend1
where (a1.friend1=1 and a2.friend1=8) and a3.friend1 is null
group by a1.friend1,a2.friend1
The CTE's are just to provide some sample data.
With Users As
(
Select 1 As UserId, 'Alice' As Name
Union All Select 2, 'Bob'
Union All Select 3, 'Caroline'
Union All Select 4, 'Doug'
)
, Friends As
(
Select 1 As Friend1, 2 As Friend2
Union All Select 2, 1
Union All Select 2, 3
Union All Select 2, 4
Union All Select 3, 1
Union All Select 3, 4
)
, UserFriends As
(
Select U1.UserId
, Case
When F1.Friend1 = U1.UserId Then F1.Friend2
Else F1.Friend1
End As Friend
From Users As U1
Join Friends As F1
On U1.UserId In(F1.Friend1,F1.Friend2)
Group By U1.UserId
, Case
When F1.Friend1 = U1.UserId Then F1.Friend2
Else F1.Friend1
End
)
Select U1.Name, U2.Name
, Count(*) As MutualFriendCount
, Group_Concat(F.Name) As SharedFriends
From UserFriends As UF1
Join UserFriends As UF2
On UF2.Friend = UF1.Friend
Join Users As U1
On U1.UserId = UF1.UserId
Join Users As U2
On U2.UserId = UF2.UserId
Join Users As F
On F.UserId = UF1.Friend
And F.UserId = UF2.Friend
Where UF1.UserId <> UF2.UserId
And Not Exists (
Select 1
From UserFriends As F3
Where F3.UserId = UF1.UserId
And F3.Friend = UF2.UserId
)
Group By U1.Name, U2.Name
Order By U1.Name, U2.Name
精彩评论