Second, I would split this statement up, creating the nodes in one pass and each relationship in its own pass:
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM "file:///Users/btibert/Dropbox/Projects/bentley-search-neo4j/data/templates.csv" AS row
WITH row
MATCH (r:Vendor {name:row.vendor})
WITH row, r
MERGE (p:Template {name:row.template_clean})
MERGE (v:Version {version:row.template_ver})
MERGE (p)-[:FROM_VERSION]->(v)
MERGE (p)-[:CREATED_BY]->(r);
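You can clearly see the Eager operator in the query plan.
If you only have a few thousand rows, this doesn't matter. But once the file grows to hundreds of thousands or even millions of rows, pulling in all of the data eagerly needs a lot more memory. If you have multiple rows per student, you can reduce the number of MERGE operations that run by using WITH DISTINCT toInt(row.pidm) AS pidm, ....
The plan for the statement above, with the Eager operator sitting between the index lookup and the merges: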
+----------------+------------------------------------+------------------------------------------------------------------------------------------------+
| Operator | Identifiers | Other |
+----------------+------------------------------------+------------------------------------------------------------------------------------------------+
| EmptyResult | | |
| UpdateGraph(0) | anon[270], anon[301], p, r, row, v | MergePattern |
| UpdateGraph(1) | anon[270], p, r, row, v | MergePattern |
| UpdateGraph(2) | p, r, row, v | MergeNode; row.template_clean; :Template(name); MergeNode; row.template_ver; :Version(version) |
| Eager | r, row | |
| SchemaIndex | r, row | row.vendor; :Vendor(name) |
| LoadCSV | row | |
+----------------+------------------------------------+------------------------------------------------------------------------------------------------+
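The split suggested for the templates.csv import could look roughly like the sketch below. It reuses the labels, properties and relationship types from the statement above; dividing the work into four passes is an assumption about how to apply the advice, not a tested import script. Prefixing a statement with EXPLAIN (Neo4j 2.2 and later) should show whether the Eager operator is gone.

```cypher
// Pass 1: create the Template nodes only.
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM "file:///Users/btibert/Dropbox/Projects/bentley-search-neo4j/data/templates.csv" AS row
MERGE (:Template {name: row.template_clean});

// Pass 2: create the Version nodes only.
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM "file:///Users/btibert/Dropbox/Projects/bentley-search-neo4j/data/templates.csv" AS row
MERGE (:Version {version: row.template_ver});

// Pass 3: connect each Template to its Version (both nodes already exist).
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM "file:///Users/btibert/Dropbox/Projects/bentley-search-neo4j/data/templates.csv" AS row
MATCH (p:Template {name: row.template_clean})
MATCH (v:Version {version: row.template_ver})
MERGE (p)-[:FROM_VERSION]->(v);

// Pass 4: connect each Template to its Vendor.
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM "file:///Users/btibert/Dropbox/Projects/bentley-search-neo4j/data/templates.csv" AS row
MATCH (p:Template {name: row.template_clean})
MATCH (r:Vendor {name: row.vendor})
MERGE (p)-[:CREATED_BY]->(r);
```

Each pass reads the file again, which costs some I/O, but every statement only merges one kind of thing, so it can stream row by row instead of buffering the whole file.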
I would probably change this next one to use the ON CREATE SET variant for the non-key properties:
LOAD CSV WITH HEADERS FROM "recs.csv" AS row
WITH row
MERGE (s:Student {pidm:toInt(row.pidm)})
ON CREATE SET s.hash_pidm=toInt(row.hash_pidm), ....;
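Combined with the WITH DISTINCT suggestion from earlier, the student import might look like the sketch below. It only carries the two columns that appear in this answer (pidm and hash_pidm); any further properties hidden behind the "...." above are deliberately left out, and the file URL is the same placeholder used in the statement above.

```cypher
LOAD CSV WITH HEADERS FROM "recs.csv" AS row
// Collapse repeated rows for the same student before merging,
// so MERGE runs once per distinct (pidm, hash_pidm) pair.
WITH DISTINCT toInt(row.pidm) AS pidm, toInt(row.hash_pidm) AS hash_pidm
MERGE (s:Student {pidm: pidm})
// Non-key properties are only written when the node is first created.
ON CREATE SET s.hash_pidm = hash_pidm;
```

Keeping only the key property inside the MERGE pattern means the lookup can use an index or constraint on :Student(pidm), if one exists.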
This one I would split into two statements, one per relationship; otherwise you might get too many matches (you don't need the WITHs in between):
LOAD CSV WITH HEADERS FROM "...recs.csv" AS row
WITH row
MATCH (s:Student {pidm: toInt(row.pidm)})
MATCH (v:Vendor {name: row.vendor})
MATCH (a:Ability {name: row.ability})
WITH row, s, v, a
MERGE (s)-[:PURCHASED_FROM]->(v)
MERGE (s)-[:HAS_ABILITY]->(a);
would become:
LOAD CSV WITH HEADERS FROM "...recs.csv" AS row
MATCH (s:Student {pidm: toInt(row.pidm)})
MATCH (v:Vendor {name: row.vendor})
MERGE (s)-[:PURCHASED_FROM]->(v);
LOAD CSV WITH HEADERS FROM "...recs.csv" AS row
MATCH (s:Student {pidm: toInt(row.pidm)})
MATCH (a:Ability {name: row.ability})
MERGE (s)-[:HAS_ABILITY]->(a);
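Here I would also create the contact information on its own (again with ON CREATE SET), and create the student-contact relationship in a separate statement: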
LOAD CSV WITH HEADERS FROM "....cont.csv" AS row
MERGE (c:Contact {cid:row.cid}) ON CREATE SET ....;
LOAD CSV WITH HEADERS FROM "...cont.csv" AS row
MATCH (s:Student {pidm:toInt(row.pidm)})
MATCH (c:Contact {cid:row.cid})
MERGE (s)-[:HAS_CONTACT]->(c);
I would also split this one into two statements:
LOAD CSV WITH HEADERS FROM "...cont.csv" AS row
WITH row WHERE toInt(row.seqnum) = 1
MATCH (s:Student {pidm:toInt(row.pidm)})
MATCH (f:Contact {cid:row.first_cont})
MERGE (s)-[:FIRST]->(f);
LOAD CSV WITH HEADERS FROM "...cont.csv" AS row
WITH row WHERE toInt(row.seqnum) = 1
MATCH (s:Student {pidm:toInt(row.pidm)})
MATCH (l:Contact {cid:row.last_cont})
MERGE (s)-[:LAST]->(l);
Split this one into creating the e-mails first and then connecting them to the students by msgid:
LOAD CSV WITH HEADERS FROM "...brm.csv" AS row
MERGE (e:Email {msgid:row.msgid}) ON CREATE SET ... ;
LOAD CSV WITH HEADERS FROM "file:///Users/btibert/Dropbox/Projects/bentley-search-neo4j/data/brm.csv" AS row
MATCH (s:Student {pidm:toInt(row.pidm)})
MATCH (e:Email {msgid:row.msgid})
MERGE (s)-[:WAS_SENT]->(e);
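All of the MATCH and MERGE lookups above are only fast if the key properties are indexed. A minimal sketch of the constraints they would rely on, assuming pidm, cid and msgid are unique per node (the plan above only confirms an index on :Vendor(name), so the rest is an assumption). This is the constraint syntax of the Neo4j 2.x/3.x releases that also provide toInt; newer releases spell it differently.

```cypher
// Assumed unique keys, taken from the MERGE/MATCH patterns in the statements above.
CREATE CONSTRAINT ON (s:Student) ASSERT s.pidm IS UNIQUE;
CREATE CONSTRAINT ON (c:Contact) ASSERT c.cid IS UNIQUE;
CREATE CONSTRAINT ON (e:Email) ASSERT e.msgid IS UNIQUE;
```

These would need to be in place before the LOAD CSV statements run.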
HTH, Michael
In your most recent LOAD CSV statement, you use 'Email:cid' as the property, while your constraint is on 'Email:msgid'
Good catch, let's see how that turns out. – Btibert3
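This time, when I try to execute the commands starting at line 111 of the cypher.cql file, I get a "database disconnected" warning in the browser. Would running the file from the command line provide any performance improvement? – Btibert3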