Neo4j: Best way to batch relate nodes using Cypher?

When I run a script that tries to merge all nodes of certain types, I get some strange performance results.

When merging two sets of nodes (~42k) and (~26k), performance is good. But when I merge (~42k) and (5), performance degrades DRAMATICALLY. I process the ParentNodes in batches (the ~42k are split into batches of 500). Why does performance drop when I am essentially merging fewer nodes (the batch size is the same, but the source set is large and the target set is small)?

Relationship query:

MATCH (s:ContactPlayer)
WHERE has(s.ContactPrefixTypeId)
WITH collect(s) AS allP
WITH allP[7000..7500] AS rangedP
FOREACH (parent IN rangedP |
    MERGE (child:ContactPrefixType {ContactPrefixTypeId: parent.ContactPrefixTypeId})
    MERGE (child)-[r:CONTACTPLAYER]->(parent)
    SET r.ContactPlayerId = parent.ContactPlayerId,
        r.ContactPrefixTypeId = child.ContactPrefixTypeId)
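
The slice bounds in a query like this (for example 7000..7500) would normally be produced by a small driver loop. Below is a minimal sketch in Python, assuming a hypothetical helper named `batch_ranges`; the total of 42149 nodes and the batch size of 500 are taken from the figures in this question, everything else is illustrative.

```python
def batch_ranges(total, batch_size=500):
    """Yield (start, end) pairs covering 0..total in steps of batch_size.

    A Cypher list slice allP[start..end] excludes `end`, so consecutive
    pairs share a boundary without overlapping.
    """
    for start in range(0, total, batch_size):
        # Clamp the last window so it does not run past the end of the list.
        yield start, min(start + batch_size, total)


if __name__ == "__main__":
    ranges = list(batch_ranges(42149))
    print(len(ranges))  # number of batches needed
    print(ranges[14])   # the window that corresponds to allP[7000..7500]
```

Each pair would be substituted into the slice before the query is sent. Note that every batch still re-collects all ContactPlayer nodes before slicing.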

Results:

Process start

Start inserting contacts [++++++++++++++++++++++++++++++++++++++++++++++++ +++ ++++++++++++++++++++++++++++++++ ++++++]


  • 42149 : 19176.87
  • (500): 213.4
  • : 663

ContactPlayer [+++++++++++++++++++++++++++++++++++++++++++++++++ +++++++]


  • 27970 ContactPlayer: 9419.2106ms
  • (500): 167.75
  • : 689ms

ContactPlayer [+++++++++++++++++++++++++++++++++++++++++++++++++ +++++++]


  • , ContactPlayer: 7907.4877ms
  • (500): 141.151517857143ms
  • : 883.0918 : 0

ContactPrefixType
[+]


  • 5 ContactPrefixType: 22.0737
  • (500): 22
  • : 22


ContactPrefixType [+++++++++++++++++++++++++++++++++++++++++++++++++ ++++++++++++++++++++++++++++++ ++++++]


  • , ContactPrefixType : 376540.8309ms
  • (500): 4429.78643647059ms
  • : 14263.1843 : 63


MATCH (s:ContactPlayer {ContactPrefixTypeId:{cptid}})
MERGE (c:ContactPrefixType {ContactPrefixTypeId:{cptid}})
MERGE (c)-[:CONTACT_PLAYER]->(s)

Using the REST API Cypher endpoint:

{
    "query":...,
    "params": {
        "cptid":id1
    }
}


{
    "statements":[
        {
            "statement":...,
            "parameters": {
                "cptid":id1
            }
        },
        {
            "statement":...,
            "parameters": {
                "cptid":id2
            }
        }
    ]
}
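
A payload of this shape can be assembled programmatically. The following is a minimal sketch in Python, not the original poster's code: the function name `build_statements_payload` and the sample ids are hypothetical, and the statement text is the MATCH/MERGE query from this answer. It assumes the `{cptid}` parameter syntax used by the queries here.

```python
import json

# Cypher statement reused for every id; {cptid} is the parameter placeholder
# in the same style as the queries in this thread.
STATEMENT = (
    "MATCH (s:ContactPlayer {ContactPrefixTypeId:{cptid}}) "
    "MERGE (c:ContactPrefixType {ContactPrefixTypeId:{cptid}}) "
    "MERGE (c)-[:CONTACT_PLAYER]->(s)"
)


def build_statements_payload(cptids):
    """Return a dict holding one statement object per ContactPrefixTypeId."""
    return {
        "statements": [
            {"statement": STATEMENT, "parameters": {"cptid": cptid}}
            for cptid in cptids
        ]
    }


if __name__ == "__main__":
    # Hypothetical ids, for illustration only.
    print(json.dumps(build_statements_payload([1, 2]), indent=4))
```

Each entry in `statements` is its own object with a `statement` and a `parameters` key, so several parameterized statements travel in one request.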


If/Else:

If childrenNodes.count() < 200 → , ... .. ContactPrefixType

, (.. ContactAddress)

If childNodes < 200

MATCH (parent:{parentLabel}),
      (child:{childLabel} {{childLabelIdProperty}:parent.{parentRelationProperty}})
CREATE (child)-[r:{relationshipLabel}]->(parent)

3-5

Else

MATCH (child:{childLabel}),
      (parent:{parentLabel} {{parentPropertyField}:child.{childLabelIdProperty}})
WITH collect(parent) AS parentCollection, child
WITH parentCollection[{batchStart}..{batchEnd}] AS coll, child
FOREACH (parent IN coll |
    CREATE (child)-[r:{relationshipLabel}]->(parent))


:

  • 225,018 2,070,977 .
  • 464 606

: 331 .



Source: https://habr.com/ru/post/1535453/

