Gremlin: What is an efficient way to find an edge between two vertices?

So, obviously, the straightforward way to find an edge between two vertices is the following:

graph.traversal().V(outVertex).bothE(edgeLabel).filter(__.otherV().is(inVertex)) 

I feel that the filter step has to iterate over all edges, which makes it very slow for applications with lots of edges.

Another way:

 traversal = graph.traversal()
     .V(outVertex)
     .bothE(edgeLabel).as("x")
     .otherV().is(inVertex) // uses index?
     .select("x");

I assume the second approach could be much faster, since it would use the ID index.

Which one is faster and more efficient (in terms of IO)?

I use Titan, so feel free to make your answer Titan-specific.

Edit

In terms of time, it seems that the first approach is faster (vertex b had 20k edges):

 gremlin> clock(100000){g.V(b).bothE().filter(otherV().is(a))}
 ==>0.0016451789999999999
 gremlin> clock(100000){g.V(b).bothE().as("x").otherV().is(a).select("x")}
 ==>0.0018231140399999999

What about IO?

2 answers

I would expect the first query to be faster. However, a few things:

  • Neither query is optimal, since both of them enable path calculation. If you need to find a connection in both directions, use two queries (I will give an example below).
  • When you use clock(), be sure to iterate() your traversals, otherwise you will only measure how long it takes to do nothing.

These are the queries I would use to find an edge in both directions:

 g.V(a).outE(edgeLabel).filter(inV().is(b))
 g.V(b).outE(edgeLabel).filter(inV().is(a))

If you expect to get no more than one edge:

 edge = g.V(a).outE(edgeLabel).filter(inV().is(b)).tryNext().orElseGet {
     // unwrap the inner Optional so that edge is either an Edge or null
     g.V(b).outE(edgeLabel).filter(inV().is(a)).tryNext().orElse(null)
 }

This way you get rid of the path calculation. How these queries perform will largely depend on the underlying graph database. Titan's query optimizer recognizes this query pattern and should return a result in almost no time.

Now, if you want to measure the runtime, do the following:

 clock(100) {
     g.V(a).outE(edgeLabel).filter(inV().is(b)).iterate()
     g.V(b).outE(edgeLabel).filter(inV().is(a)).iterate()
 }
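
The effect of iterate() can be checked with a small sketch. It assumes an in-memory TinkerGraph rather than Titan, and the vertex names, the 'knows' label, and the countVisits helper are made up for illustration. A sideEffect counter shows that merely building a traversal, which is all clock() would time without iterate(), touches no edges:

```groovy
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph

// Assumption: TinkerGraph stands in for Titan; the data is hypothetical.
graph = TinkerGraph.open()
g = graph.traversal()
a = graph.addVertex('name', 'a')
b = graph.addVertex('name', 'b')
a.addEdge('knows', b)

// Builds the directed lookup and optionally drains it; returns how many
// edges passed the filter while it ran.
countVisits = { boolean drain ->
    def visited = 0
    def t = g.V(a).outE('knows').filter(__.inV().is(b)).sideEffect { visited++ }
    if (drain) t.iterate()
    visited
}

assert countVisits(false) == 0   // traversal built but never executed
assert countVisits(true) == 1    // iterate() actually walks the edge
```

Note that typing a bare traversal at the Gremlin Console prompt iterates it automatically; the laziness only bites inside closures such as the one passed to clock().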

If you do not know the vertex IDs, another solution might be:

 g.V().has('propertykey','value1').outE('thatlabel').as('e').inV().has('propertykey','value2').select('e')

It is also unidirectional, so you need to reformulate the query for the opposite direction.
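
One way to cover both directions is to wrap the two property-based queries in a helper. This is a sketch only: it assumes an in-memory TinkerGraph instead of Titan, and the 'name' key, the 'knows' label, and the edgesBetween helper are placeholders, not anything from the original setup.

```groovy
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph

// Assumption: TinkerGraph stands in for Titan; keys, values and labels are hypothetical.
graph = TinkerGraph.open()
g = graph.traversal()
v1 = graph.addVertex('name', 'a')
v2 = graph.addVertex('name', 'b')
v3 = graph.addVertex('name', 'c')
v1.addEdge('knows', v2)   // stored direction: a -> b

// Runs the unidirectional query once per direction and concatenates the results.
edgesBetween = { key, x, y, label ->
    def forward  = g.V().has(key, x).outE(label).as('e').inV().has(key, y).select('e').toList()
    def backward = g.V().has(key, y).outE(label).as('e').inV().has(key, x).select('e').toList()
    forward + backward
}

assert edgesBetween('name', 'a', 'b', 'knows').size() == 1
assert edgesBetween('name', 'b', 'a', 'knows').size() == 1   // found via the reverse query
assert edgesBetween('name', 'a', 'c', 'knows').isEmpty()
```

Without an index on the property key, each g.V().has(...) start is likely to scan all vertices, so this is only worthwhile when the IDs really are unknown.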


Source: https://habr.com/ru/post/944086/

