How can I improve performance for a massive MERGE-based insert?

I am trying to insert data from my SQL database into Neo4j. I have a CSV file where each line generates 4-5 entities and some relationships between them. Entities can be duplicated across lines, and I want to enforce their uniqueness.

What I am doing now (a Cypher sketch follows the list):

  • create a uniqueness constraint for each label.
  • iterate over the CSV:
    • start a transaction
    • build MERGE statements for the entities
    • build MERGE statements for the relationships
    • commit the transaction
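
For concreteness, here is a minimal sketch of that pattern, with hypothetical Person/Company labels and an id key (the constraint syntax is the older Neo4j 3.x form; newer versions use CREATE CONSTRAINT ... FOR ... REQUIRE):

    // One uniqueness constraint per label; each also creates a backing index.
    CREATE CONSTRAINT ON (p:Person) ASSERT p.id IS UNIQUE;
    CREATE CONSTRAINT ON (c:Company) ASSERT c.id IS UNIQUE;

    // Per CSV row: MERGE each entity on its unique key only, set the
    // remaining properties on first creation, then MERGE the relationships.
    MERGE (p:Person {id: $personId})
      ON CREATE SET p.name = $personName
    MERGE (c:Company {id: $companyId})
      ON CREATE SET c.name = $companyName
    MERGE (p)-[:WORKS_AT]->(c);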

Performance was bad, so I then tried committing the transaction every X rows (X was 100, 500, 1000, and 5000; one such batch is sketched after the list below). This is better, but I still have two problems:

  • It is slow: about 1-1.5 seconds per 100 rows on average (each row = 4-5 entities and 4-5 relationships).
  • It gets worse as I keep adding data. I usually start at 400-500 ms per 100 rows, and after ~5000 rows it degrades to ~4-5 seconds per 100 rows.
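
In case it matters, one batch is roughly equivalent to a single parameterized statement over a list of rows (labels and parameter names are again hypothetical):

    // $rows is a list of maps, one per CSV line; the whole statement
    // runs (and is committed) once per batch of X rows.
    UNWIND $rows AS row
    MERGE (p:Person {id: row.personId})
      ON CREATE SET p.name = row.personName
    MERGE (c:Company {id: row.companyId})
      ON CREATE SET c.name = row.companyName
    MERGE (p)-[:WORKS_AT]->(c)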

From what I know, my uniqueness constraint also creates an index on that property, and that property is exactly the one I use in the MERGE pattern when creating a node. Is there any chance it is not using the index?
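
One way I know to check is to inspect the query plan for a representative MERGE: if the constraint's index is used, the plan should contain a unique-index seek rather than a label scan (operator names may vary by version; the label and value here are hypothetical):

    // EXPLAIN shows the plan without executing the statement (PROFILE also
    // runs it and reports db hits). A NodeUniqueIndexSeek operator means the
    // index is used; a NodeByLabelScan means every node with that label is
    // scanned on each MERGE, which would explain the slowdown as data grows.
    EXPLAIN MERGE (p:Person {id: 42});

    // Possible pitfall, depending on version: merging on the key plus extra
    // properties may stop the planner from using the single-property index.
    // MERGE (p:Person {id: 42, name: 'Alice'})   // may fall back to a scan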

What is the best practice for improving performance here? I saw BatchInserter, but I was not sure whether it can be used with MERGE operations.

Thanks.

