Does label order affect search performance?

Question

I am using Neo4j 2.1.7. I recently experimented with MATCH queries, looking for nodes with multiple labels, and I found that the queries

 MATCH (p:A:B) RETURN count(p) AS number

and

 MATCH (p:B:A) RETURN count(p) AS number

usually take different amounts of time, especially when you have, for example, 2 million A nodes and 0 B nodes. So does the order of the labels affect search time? Is this feature documented anywhere?

+6

neo4j cypher

Evgen Feb 10 '15 at 14:28

1 answer

Stefan Armbruster · Accepted Answer · 2015-02-10T15:21:56+0000

Neo4j internally maintains a label scan store, which is basically a lookup for quickly getting all nodes that carry a specific label A.

When executing a query like

 MATCH (n:A:B) return count(n)

the label scan store is used to find all A nodes, which are then filtered by whether they also carry label B. If n(A) >> n(B), it is much more efficient to run MATCH (n:B:A) instead, since you only look up the few B nodes and filter those for A.
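To decide which label to write first on 2.1.x, one option is to count the nodes per label beforehand. A minimal sketch, assuming the labels A and B from the question and that the statements are run one at a time (for example in neo4j-shell); the column names nodesWithA and nodesWithB are only illustrative:

```cypher
// Count how many nodes carry each label.
// The label with the smaller count is the cheaper one to scan first.
MATCH (n:A) RETURN count(n) AS nodesWithA;
MATCH (n:B) RETURN count(n) AS nodesWithB;
```

If nodesWithB turns out to be much smaller than nodesWithA, writing the pattern as (n:B:A) should be the better choice on 2.1.x.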

You can use PROFILE MATCH (n:A:B) return count(n) to view the query plan. For Neo4j <= 2.1.x, you will see a different query plan depending on the order in which the labels are specified.
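As a sketch of that comparison, both label orders can be profiled one after the other. This assumes the labels A and B from the question and a shell that accepts semicolon-terminated statements:

```cypher
// Profile both label orders and compare the reported query plans.
PROFILE MATCH (n:A:B) RETURN count(n);
PROFILE MATCH (n:B:A) RETURN count(n);
```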

Starting with Neo4j 2.2 (milestone M03 is available at the time of writing), there is a cost-based Cypher optimizer. Cypher now knows about node statistics and uses them to optimize the query.

As an example, I used the following statements to create some test data:

 create (:A:B);
 with 1 as a foreach (x in range(0,1000000) | create (:A));
 with 1 as a foreach (x in range(0,100) | create (:B));

Now we have 100 B nodes, 1M A nodes and 1 AB node. In 2.2, both statements:

 MATCH (n:B:A) return count(n)
 MATCH (n:A:B) return count(n)

result in the exact same query plan (and therefore the same execution speed):

 +------------------+---------------+------+--------+-------------+---------------+
 | Operator         | EstimatedRows | Rows | DbHits | Identifiers | Other         |
 +------------------+---------------+------+--------+-------------+---------------+
 | EagerAggregation |             3 |    1 |      0 | count(n)    |               |
 | Filter           |            12 |    1 |     12 | n           | hasLabel(n:A) |
 | NodeByLabelScan  |            12 |   12 |     13 | n           | :B            |
 +------------------+---------------+------+--------+-------------+---------------+

Since there are only a few B nodes, it is cheaper to scan for B and filter for A. Smart Cypher, right? ;-)
