I have a directed graph described by A -> B
meaning that there exists a connection, always weight 1, FROM A TO B. The problem I want to solve (This is not an academic project) is how can I tell how many common connections there are between two nodes.
To say that in terms of A and B.
There are 2 things that need to get done, * To look at all of my, B, links coming in (All A's to some B) * Count how many common A's out of all My, Original B, A's.I do not know if that makes sense but I'll show you how far I have come.
* First point.SELECT A
FROM graph
WHERE B='myid';
As most can tell, part 1 is a very simple question. Part 2 is where things get tricky.
I have been able to get all the A's with at least 1 connection or more similar.Second point.
SELECT G.A, count( G2.A ) AS common
FROM graph AS G2
JOIN (
开发者_开发百科 SELECT A, B
FROM graph
WHERE B = 'myid'
) AS G ON G.A = G2.B
So the second point is close because it will return all common links, but it will not return all links which have no common links. Is there a way to get that too?
There is still confusion: Ill try to draw a picture... with words.
Here is the table.A, B
-----
2, 1
3, 1
2, 3
If I wanted to see how many common links from all incoming links into NODE 1 I should see
A, count
---------
2, 1 // This is for 2's connection to 3.
3, 0
With the current SQL statement I have I see this.
A, count
---------
2, 1 // This is for 2's connection to 3.
Instead of using a subquery, I would just use JOINs:
SELECT
N1.A,
COUNT(N3.A)
FROM
Nodes N1
INNER JOIN Nodes N2 ON
N2.B = 'myid'
LEFT OUTER JOIN Nodes N3 ON
N3.A = N1.A AND
N3.B = N2.A
WHERE
N1.B = 'myid'
GROUP BY
N1.A
This gives the answer that you are expecting in your example. You may want to test it out on a wider sample.
You should also do performance testing on this if your data set is of any significant size.
精彩评论