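Speeding up py2neo with Cypher
I am using py2neo with Python 3.2 on Ubuntu Linux to populate a graph in neo4j from an SQLite3 database. Speed is not the primary concern, but after about 3 hours the graph has received only 40K rows (one relationship per SQL row) out of 5 million in total.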
Here is the main loop:
from py2neo import neo4j as neo
import sqlite3 as sql

# sql_cursor and neo4j_db are created earlier (not shown here)

# select all 5M rows from the SQL database
sql_str = """select * from bigram_with_number"""

# loop through each row
for (freq, first, firstfreq, second, secondfreq) in sql_cursor.execute(sql_str):
    # create the Cypher query string using Cypher 2.0 with merge
    # so that nodes are created only if needed
    query = neo.CypherQuery(neo4j_db, """
        CYPHER 2.0
        merge (n:word {form: {firstvar}, freq: {freqfirst}})
        merge (m:word {form: {secondvar}, freq: {freqsecond}})
        create unique (n)-[:bigram {freq: {freqbigram}}]->(m) return n, m""")
    # execute the query with parameters taken from the SQL row
    result = query.execute(freqbigram=freq, firstvar=first, freqfirst=firstfreq,
                           secondvar=second, freqsecond=secondfreq)
The database is being populated correctly, but at this rate it will take weeks to finish. I suspect this can be done faster.
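For reference, the "weeks" estimate follows directly from the figures quoted in the question. A rough back-of-the-envelope check, assuming the load rate stays constant (the variable names are only illustrative):

```python
# Figures quoted in the question.
rows_done = 40_000        # rows loaded so far
hours_elapsed = 3         # time it took to load them
rows_total = 5_000_000    # rows in the SQLite table

# Assumption: throughput stays constant for the whole load.
hours_total = rows_total * hours_elapsed / rows_done
days_total = hours_total / 24

print(hours_total, "hours, i.e. roughly", round(days_total, 1), "days")
```

That works out to about 375 hours, a little over two weeks of continuous loading.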
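I moved the creation of the Cypher query out of the loop and removed the `return n, m` clause from it, which made it about 5 times faster. But it is still too slow.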
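The change described in this comment could look roughly like the sketch below. It is not the poster's actual code: the query text is defined once, the `return` clause is gone, and the loop only passes parameters. So that the sketch runs without a Neo4j server, `execute` is a hypothetical stand-in callable, and the demo table uses made-up column names (`w1`, `w2`, ...) and made-up rows; only the column order is taken from the question.

```python
import sqlite3

# Defined once, outside the loop. There is no RETURN clause, so no nodes
# are sent back to the client.
BIGRAM_QUERY = """
CYPHER 2.0
merge (n:word {form: {firstvar}, freq: {freqfirst}})
merge (m:word {form: {secondvar}, freq: {freqsecond}})
create unique (n)-[:bigram {freq: {freqbigram}}]->(m)"""


def load_bigrams(sql_cursor, execute):
    """Call execute(**params) once per SQLite row and return the row count.

    `execute` is a hypothetical stand-in for the execute method of a query
    object that was built once from BIGRAM_QUERY.
    """
    count = 0
    rows = sql_cursor.execute("select * from bigram_with_number")
    for freq, first, firstfreq, second, secondfreq in rows:
        execute(freqbigram=freq, firstvar=first, freqfirst=firstfreq,
                secondvar=second, freqsecond=secondfreq)
        count += 1
    return count


# Demo: an in-memory table (assumed schema, invented rows) and a recorder
# in place of a real Neo4j connection.
conn = sqlite3.connect(":memory:")
conn.execute("create table bigram_with_number "
             "(freq, w1, w1freq, w2, w2freq)")
conn.executemany("insert into bigram_with_number values (?, ?, ?, ?, ?)",
                 [(3, "the", 10, "cat", 4), (2, "cat", 4, "sat", 5)])
sent = []
n_rows = load_bigrams(conn.cursor(), lambda **params: sent.append(params))
print(n_rows, "rows sent")
```

With py2neo as used in the question, the callable would presumably be `query.execute`, where `query = neo.CypherQuery(neo4j_db, BIGRAM_QUERY)` is created a single time before the loop.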