I am prototyping a data authorization / protection scheme in Neo4j and I am having a strange problem with one of my queries. For background, the concept is that a user trying to get from a to b may do so only if they have the correct access identifier. So, our edges have types that are the access identifiers. I am testing this scheme by creating many nodes and connecting pairs of them with different accesses. That is, I have many sets of:
(a)-[:ACCESS_A]->(b)
with different accesses. I query them with:
{some query} with a match (a)-[:ACCESS_A|:ACCESS_B|<...>|:ACCESS_Z]->(b) return b
where the size of the list in the match edge grows with the number of accesses available to the user.
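For anyone who wants to reproduce this, here is a minimal sketch of how such a query string can be generated for a given number of relationship types. The function names and the "T001"-style naming scheme are hypothetical (my real type names differ), and the leading "{some query}" part is left out; only the shape of the match pattern follows the query above.

```python
def rel_type_names(count, prefix=""):
    """Return `count` relationship type names.

    Assumption: a 4-character zero-padded scheme ("T001", "T002", ...),
    which stays 4 characters long for counts up to 999. Passing
    prefix="EDGE_" gives the 9-character variant.
    """
    return [f"{prefix}T{i:03d}" for i in range(1, count + 1)]


def build_match_query(count, prefix=""):
    """Build a PROFILE query whose match edge lists `count` types."""
    # Join the names with "|:" so the pattern reads [:T001|:T002|...].
    alternatives = "|:".join(rel_type_names(count, prefix))
    return f"PROFILE MATCH (a)-[:{alternatives}]->(b) RETURN b"


if __name__ == "__main__":
    # Print the query length on either side of the observed jump.
    for n in (200, 201):
        print(n, len(build_match_query(n)))
```

Running the generated strings for consecutive counts (199, 200, 201, ...) and comparing the db hits in the profile output is how the jump shows up.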
All of this works fine until the list reaches 201 accesses. At that point, the db hits and the time spent shown in the profile go WAY up. With 200 relationship types, the profile shows 1051 db hits, but with 201 relationship types it shows 31801. That is a 30x increase for one more type! The time goes up similarly. The jump from 199 to 200 only adds about 50 hits, and that is due to an increase in the number of nodes.
After more detailed work, it seems that the round number 200 is more of a red herring than the actual issue. Previously, my relationship types were 4 characters long. When I changed them to 9 characters (adding "EDGE_" as a test), the problem started at 50 types: 50 has 36 hits and 51 has 291, a smaller jump but still significant compared to the previous increases in the same test.
There seems to be some relationship between the relationship type name and the point where the query falls over, but I'm still investigating.
Things that I tested and found not to be of interest:
- overall query length (string size): it fails at completely different query sizes with 4-character and 9-character relationship types
- length of the list in the [e:<...>] clause (string size): as above, it fails at very different sizes
- the number of nodes or edges in the graph