Is there an efficient way with given two nodes to find a set of their common nodes (with defined relationships).
For example, having nodes A1
, B1
, C1
-C4
connected with relationships x
and y
:
A1 --x--> C1
A1 --x--> C2
A1 --x--> C3
B1 --y--> C2
B1 --y--> C3
B1 --y--> C4
a common node set for A1(x)
and B1(y)
开发者_Python百科 would be [C2, C3]
.
In Gremlin (http://gremlin.tinkerpop.com), this is expressed as such:
setA._().out('x').in('y').retain(setB).back(2)
Here is what each step does:
- start at setA (A1, A2, A3 in your example).
- start a Gremlin pipeline.
- take the outgoing "x" labeled edges from those setA vertices to C1, C2, and C3.
- take the incoming "y" labeled edges from C1, C2, and C3.
- filter out all steps that are not in setB (thus, only C2 and C3 paths exist).
- go back to what you saw 2 steps ago -- thus, C2 and C3.
Tada!
Good luck, Marko.
http://markorodriguez.com
In many cases the structure of the domain can be leveraged to improve performance. Let's say that you know that in general your A
entities have less x
relationships compared to the number of y
relationships on the B
entities. Then you could traverse two steps from the A node and see where the B
node shows up, and filter out the C
nodes this way. Here's some code for this approach:
Set<Node> found = new HashSet<Node>();
for ( Relationship firstRel : a1.getRelationships( Reltypes.x, Direction.OUTGOING ) )
{
Node cNode = firstRel.getEndNode();
for ( Relationship secondRel : cNode.getRelationships( Reltypes.y, Direction.INCOMING ) )
{
Node bNode = secondRel.getStartNode();
if ( bNode.equals( b1 ) )
{
found.add( cNode );
break;
}
}
}
Another way would be to start two threads that scan the relationships from either side.
A third approach would be to create a specialized index that would help answering this kind of queries, which would obviously hurt insert performance.
精彩评论