I am using the Cypher LOAD CSV syntax in Neo4j 2.1.2. So far it has been a big improvement over the more manual ETL process required in previous versions. However, in one case I am seeing behavior that I did not expect, and I wonder if I am missing something.
The Cypher query I am using is as follows:
USING PERIODIC COMMIT 500
LOAD CSV FROM 'file:///Users/James/Desktop/import/dependency_sets_short.csv' AS row
MATCH (s:Sense {uid: toInt(row[4])})
MERGE (ds:DependencySet {label: row[2]}) ON CREATE SET ds.optional=(row[3] = 't')
CREATE (s)-[:has]->(ds)
Here are a couple of CSV lines:
227303,1,TO-PURPOSE-NOMINAL,t,73830
334471,1,AT-LOCATION,t,92048
334470,1,AT-TIME,t,92048
334469,1,ON-LOCATION,t,92048
227302,1,TO-PURPOSE-INFINITIVE,t,73830
116008,1,TO-LOCATION,t,68204
116007,1,IN-LOCATION,t,68204
227301,1,TO-LOCATION,t,73830
334468,1,ON-DATE,t,92048
116006,1,AT-LOCATION,t,68204
334467,1,WITH-ASSOCIATE,t,92048
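What I am doing is looking up a Sense node (already in the database) by its ID, then creating a DependencySet node if it does not already exist, and finally creating a has relationship from the Sense node to the DependencySet node. As far as I can tell, this all works as expected. What surprises me is how the running time changes with the number of CSV lines: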
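To spell out which CSV columns the query reads, here is a rough Python sketch of the per-row logic. It only mirrors the column indexing (`row[2]`, `row[3]`, `row[4]`) on three of the sample lines above. The helper name `describe_row` and the dictionary keys are made up for illustration, and nothing here talks to Neo4j.

```python
import csv
import io

# Three of the sample lines shown above, copied verbatim.
SAMPLE = """227303,1,TO-PURPOSE-NOMINAL,t,73830
334471,1,AT-LOCATION,t,92048
116006,1,AT-LOCATION,t,68204
"""

def describe_row(row):
    """Return the values the Cypher query derives from one CSV row (hypothetical helper)."""
    return {
        "sense_uid": int(row[4]),   # toInt(row[4]) in the MATCH
        "label": row[2],            # key used by the MERGE
        "optional": row[3] == "t",  # ON CREATE SET ds.optional = (row[3] = 't')
    }

rows = [describe_row(r) for r in csv.reader(io.StringIO(SAMPLE))]

# MERGE keeps one DependencySet per distinct label; CREATE adds one
# relationship per row, assuming the MATCH finds the Sense node.
distinct_labels = {r["label"] for r in rows}
print(len(rows), "relationships,", len(distinct_labels), "DependencySet nodes")
```

On these three lines the sketch reports 3 relationships and 2 DependencySet nodes, since AT-LOCATION appears twice.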
CSV Lines    Time (msec)
------------------------------
500          480
1000         717
2000         1110
5000         1521
10000        2111
50000        4794
100000       5907
200000       12302
300000       35494
400000       Java heap space error
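As a rough way to read these numbers, the timings can be converted into a cost per 1000 lines. The Python sketch below does only that arithmetic on the figures from the table. The names `timings`, `ms_per_1000_lines`, `per_k` and `cheapest` are mine and not part of any Neo4j tooling.

```python
# (csv_lines, elapsed_ms) pairs copied from the table above.
timings = [
    (500, 480), (1000, 717), (2000, 1110), (5000, 1521), (10000, 2111),
    (50000, 4794), (100000, 5907), (200000, 12302), (300000, 35494),
]

def ms_per_1000_lines(lines, elapsed_ms):
    """Average cost of importing 1000 lines for one run."""
    return elapsed_ms * 1000 / lines

per_k = {lines: ms_per_1000_lines(lines, ms) for lines, ms in timings}

for lines, cost in per_k.items():
    print(f"{lines:>7} lines: {cost:8.1f} ms per 1000 lines")

# Input size with the lowest per-line cost.
cheapest = min(per_k, key=per_k.get)
print("lowest cost per 1000 lines at", cheapest, "lines")
```

By this measure the cost per 1000 lines keeps falling up to 100,000 lines (about 59 ms), stays close to that at 200,000 (about 62 ms), and then roughly doubles at 300,000 (about 118 ms).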
Note that I am using a periodic commit of 500, as described in the manual.

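Somewhere between 300k and 400k lines, the import runs into a Java heap space error. The slowdown is already visible before that point: 300,000 lines take almost three times as long as 200,000. Is this expected behavior in Neo4j, or am I missing something?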
By the way, I have uniqueness constraints on Sense.uid and DependencySet.label, so both properties are indexed. Here is the schema:
Indexes
ON :DependencySet(label) ONLINE (for uniqueness constraint)
ON :Sense(uid) ONLINE (for uniqueness constraint)
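Update: the problem seems to lie with the MATCH and/or the CREATE. If I remove lines 3 and 5 of the Cypher query, the problem goes away.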