I am trying to import data from my SQL database into Neo4j. I have a CSV file where each line produces 4-5 entities and some relationships between them. Entities can be duplicated across lines, so I want to enforce uniqueness.
What I am doing now:
- create a uniqueness constraint for each label.
- iterate over the CSV; for each line:
  - begin a transaction
  - build MERGE statements for the entities
  - build MERGE statements for the relationships
  - commit the transaction
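For reference, the per-line work can be sketched in Cypher roughly like this (the labels, property names, and relationship type here are hypothetical placeholders for my actual model):

```cypher
// Uniqueness constraint, which should also back MERGE lookups with an index
CREATE CONSTRAINT ON (p:Person) ASSERT p.id IS UNIQUE;

// One parameterized statement per CSV line (or per batch of lines),
// with $row supplied as a parameter map by the driver
MERGE (p:Person  {id: $row.personId})
MERGE (c:Company {id: $row.companyId})
MERGE (p)-[:WORKS_AT]->(c);
```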
Performance was bad. I then tried committing the transaction only every X rows (X = 100, 500, 1000, and 5000). That is better, but I still have two problems:
- It is slow: about 1-1.5 seconds per 100 rows on average (one row = 4-5 entities and 4-5 relationships).
- It gets worse as I keep adding data. I typically start at 400-500 ms per 100 rows, and after ~5000 rows it degrades to ~4-5 seconds per 100 rows.
As far as I know, a uniqueness constraint also creates an index on that property, and that is exactly the property I match on when I MERGE a node. Any chance it is not using the index?
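To check this, I believe one can prefix the statement with PROFILE (or EXPLAIN) and inspect the plan; an index-backed MERGE should show an index seek operator rather than a label scan. Again, the label and property here are placeholders:

```cypher
// Look for NodeUniqueIndexSeek (index used) vs. NodeByLabelScan (index NOT used)
// in the plan that the shell/browser prints for this query
PROFILE
MERGE (p:Person {id: 42})
RETURN p;
```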
What is the best practice for improving throughput? I saw BatchInserter, but I am not sure whether it can be used with MERGE operations.
Thanks!