I am trying to accomplish a fairly common task. I have a substantial data set in a Neo4j database, and from a RESTful web service I want to return the data in chunks of 25 nodes. My model is pretty simple:
(:Tenant {Hash:''})-[:owns]->(:Asset {Hash:'', Name:''})
I have unique constraints on the Hash property of both labels.
If I wanted to get the 101st page of data, my Cypher query would look like this:
MATCH (:Tenant {Hash:'foo'})-[:owns]->(a:Asset)
RETURN a
ORDER BY a.Hash
SKIP 2500
LIMIT 25
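For reference, the SKIP value above follows directly from the page number and page size. A minimal sketch of that arithmetic, assuming 1-based page numbers; page_window is a hypothetical helper name used only for illustration, not part of my actual service:

```python
def page_window(page, page_size=25):
    """Return the (skip, limit) pair for a 1-based page number.

    Hypothetical helper: page 1 starts at offset 0, so the offset
    for page n is (n - 1) * page_size.
    """
    if page < 1:
        raise ValueError("page numbers start at 1")
    if page_size < 1:
        raise ValueError("page_size must be positive")
    skip = (page - 1) * page_size
    return skip, page_size


# The 101st page of 25 nodes corresponds to SKIP 2500 LIMIT 25.
skip, limit = page_window(101)
print(skip, limit)
```

The two values would then be passed to the query in place of the literal 2500 and 25.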
My data set consists of a single tenant with ~75K assets. The above query takes ~30 (!) seconds to complete. I also notice that the further I advance through the data (i.e., the higher the SKIP), the longer the query takes to return.
I quickly realized that the culprit behind my performance problems is the ORDER BY a.Hash. When I remove it, the query returns in under a second. This is actually quite surprising, as I would expect the index itself to be ordered as well.
Obviously, to implement reasonable pagination, I have to have a consistent sort order.
- Any tips on making this query perform well?
- Alternative suggestions for paging? I could add dedicated page nodes, but that would be difficult to maintain.
- What is the default sort order anyway, and is it consistent?