2014-02-13 30 views
0

Neo4j throws java.lang.OutOfMemoryError when inserting large amounts of data

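My program throws java.lang.OutOfMemoryError when it inserts a large amount of data, even though I have already applied some tuning tips such as changing JAVA_OPTS and batching the transactions. I have heard that the JVM's memory usage should drop once Neo4j commits its transactions, but that does not seem to happen. The exception is raised after about 7 million lines have been processed. Any suggestions?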

These are my Neo4j properties

neostore.propertystore.db.index.keys.mapped_memory=20M 
neostore.propertystore.db.index.mapped_memory=20M 
neostore.nodestore.db.mapped_memory=400M 
neostore.relationshipstore.db.mapped_memory=1000M 
neostore.propertystore.db.mapped_memory=400M 
neostore.propertystore.db.strings.mapped_memory=400M 
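
A quick way to sanity-check settings like these is to add them up and compare the total, together with the heap size, against the physical RAM of the machine. The sketch below only does that arithmetic for the values posted in this question. The class and method names are made up for illustration, and whether the mapped memory sits outside the Java heap depends on the platform and Neo4j version, so treat the "combined" figure as an upper-bound estimate rather than a measurement.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Neo4j: sums the mapped_memory settings
// posted in the question and compares them with the -Xmx value.
final class MappedMemorySum {

    /** Parses a size such as "400M" or "2G" into megabytes. */
    static long toMegabytes(String size) {
        String s = size.trim().toUpperCase();
        char unit = s.charAt(s.length() - 1);
        long number = Long.parseLong(s.substring(0, s.length() - 1));
        switch (unit) {
            case 'M': return number;
            case 'G': return number * 1024;
            default: throw new IllegalArgumentException("Unsupported unit in: " + size);
        }
    }

    /** The six settings exactly as posted in the question. */
    static Map<String, String> postedSettings() {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("neostore.propertystore.db.index.keys.mapped_memory", "20M");
        settings.put("neostore.propertystore.db.index.mapped_memory", "20M");
        settings.put("neostore.nodestore.db.mapped_memory", "400M");
        settings.put("neostore.relationshipstore.db.mapped_memory", "1000M");
        settings.put("neostore.propertystore.db.mapped_memory", "400M");
        settings.put("neostore.propertystore.db.strings.mapped_memory", "400M");
        return settings;
    }

    /** Adds up all configured sizes, in megabytes. */
    static long totalMegabytes(Map<String, String> settings) {
        long total = 0;
        for (String value : settings.values()) {
            total += toMegabytes(value);
        }
        return total;
    }

    public static void main(String[] args) {
        long mapped = totalMegabytes(postedSettings());
        long heap = toMegabytes("2G"); // the -Xmx2G from the question
        System.out.println("mapped memory: " + mapped + " MB");
        System.out.println("heap (-Xmx):   " + heap + " MB");
        System.out.println("combined:      " + (mapped + heap) + " MB");
    }
}
```

For the posted values the mapped memory adds up to 2240 MB, which together with the 2 GB heap comes to 4288 MB.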

These are my JVM options

java -jar -server -Xmx2G -XX:+UseConcMarkSweepGC neodataio.jar [email protected] 

This is my code

public Node createNode(String type, String v) {
    stype = type;
    UniqueFactory.UniqueNodeFactory factory = new UniqueFactory.UniqueNodeFactory(
            db, type) {
        @Override
        protected void initialize(Node created,
                Map<String, Object> properties) {
            created.addLabel(DynamicLabel.label(stype));
            created.setProperty("v", properties.get(stype));
        }
    };
    return factory.getOrCreate(type, v);
}

private void processLine(String line) {
    line = stripeStr(line);
    String[] fields = line.split("[" + splitor + "]");
    List<Node> row = new ArrayList<Node>();
    Map<String, Boolean> unqi = new HashMap<String, Boolean>();
    for (String field : fields) {
        String[] kvs = field.split("[" + kv + "]");
        if (kvs.length == 2
                && !unqi.containsKey(kvs[1])
                && !stripeStr(kvs[1]).equals("")
                && !stripeStr(kvs[1]).toLowerCase().equals("null")) {
            Node n = createNode(stripeStr(kvs[0]), stripeStr(kvs[1]));
            row.add(n);
            unqi.put(kvs[1], true);
        }
    }
    if (row.size() > 1) {
        for (int i = 1; i < row.size(); i++) {
            row.get(0).createRelationshipTo(row.get(i), Importer.connect);
        }
    }
}

private void processBatch(ArrayList<String> batch) {
    Transaction tx = db.beginTx();
    try {
        for (String line : batch) {
            processLine(line);
        }
        tx.success();
    } finally {
        tx.close();
    }
}

private String stripeStr(String str) {
    return str.trim().replace("\n", "").replace("\t", "");
}

public void processFile(String filepth) throws IOException {
    long begin = new Date().getTime();
    File f = new File(filepth);
    FileInputStream fi = new FileInputStream(f);
    BufferedReader dr = new BufferedReader(new InputStreamReader(fi));
    String line;
    int i = 1;
    ArrayList<String> batch = new ArrayList<String>();
    while ((line = dr.readLine()) != null) {
        batch.add(line);
        if (i % batchsize == 0) {
            processBatch(batch);
            batch = new ArrayList<String>();
            System.out.println(i);
        }
        i++;
    }
    processBatch(batch);
    System.out.println(i);
    long end = new Date().getTime();
    System.out.println("cost time:" + (end - begin));
}

The exception

Exception in thread "GC-Monitor" java.lang.OutOfMemoryError: Java heap space 
    at java.util.Arrays.copyOfRange(Arrays.java:2694) 
    at java.lang.String.<init>(String.java:203) 
    at java.lang.StringBuilder.toString(StringBuilder.java:405) 
    at org.neo4j.kernel.impl.cache.MeasureDoNothing.run(MeasureDoNothing.java:84) 
Exception in thread "main" org.neo4j.graphdb.TransactionFailureException: Unable to commit transaction 
    at org.neo4j.kernel.TopLevelTransaction.close(TopLevelTransaction.java:140) 
    at com.bfd.finance.neo4j.dataio.Importer.processBatch(Importer.java:79) 
    at com.bfd.finance.neo4j.dataio.Importer.processFile(Importer.java:98) 
    at com.bfd.finance.neo4j.dataio.Importer.main(Importer.java:161) 
Caused by: org.neo4j.graphdb.TransactionFailureException: commit threw exception 
    at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:498) 
    at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:397) 
    at org.neo4j.kernel.impl.transaction.TransactionImpl.commit(TransactionImpl.java:122) 
    at org.neo4j.kernel.TopLevelTransaction.close(TopLevelTransaction.java:124) 
    ... 3 more 
Caused by: javax.transaction.xa.XAException 
    at org.neo4j.kernel.impl.transaction.TransactionImpl.doCommit(TransactionImpl.java:553) 
    at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:460) 
    ... 6 more 
Caused by: java.lang.OutOfMemoryError: Java heap space 
    at java.util.HashMap.createEntry(HashMap.java:901) 
    at java.util.HashMap.putForCreate(HashMap.java:554) 
    at java.util.HashMap.putAllForCreate(HashMap.java:559) 
    at java.util.HashMap.<init>(HashMap.java:298) 
    at org.neo4j.kernel.impl.nioneo.xa.WriteTransaction.applyCommit(WriteTransaction.java:817) 
    at org.neo4j.kernel.impl.nioneo.xa.WriteTransaction.doCommit(WriteTransaction.java:751) 
    at org.neo4j.kernel.impl.transaction.xaframework.XaTransaction.commit(XaTransaction.java:322) 
    at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.commitWriteTx(XaResourceManager.java:530) 
    at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.commit(XaResourceManager.java:446) 
    at org.neo4j.kernel.impl.transaction.xaframework.XaResourceHelpImpl.commit(XaResourceHelpImpl.java:64) 
    at org.neo4j.kernel.impl.transaction.TransactionImpl.doCommit(TransactionImpl.java:545) 
    ... 7 more 
+0

You might want to look at batch insertion - http://docs.neo4j.org/chunked/milestone/batchinsert.html. Also consider committing your transaction periodically, e.g. every 100 inserts. – tstorms

+0

I have already looked into batch insertion, but it does not offer a 'get or create' operation. And my code already commits in batches. – vshanyiao
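
One common workaround for the missing 'get or create' is to keep the lookup in the importing program itself: an in-memory map from (type, value) to the id of the node already created for that pair. The sketch below shows only that idea and does not call Neo4j at all. `GetOrCreateCache` and the `creator` callback are hypothetical names, and the callback stands in for whatever call actually creates the node and returns its id. Note that the map lives on the heap, so with many millions of distinct values it can itself become a memory problem.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical helper: remembers which (type, value) pairs already have a node.
final class GetOrCreateCache {
    private final Map<List<String>, Long> idsByKey = new HashMap<>();
    private final BiFunction<String, String, Long> creator;

    /** creator is called once per new (type, value) pair and returns the new node's id. */
    GetOrCreateCache(BiFunction<String, String, Long> creator) {
        this.creator = creator;
    }

    /** Returns the cached id for the pair, creating the node on first sight. */
    long getOrCreate(String type, String value) {
        // A two-element list has value-based equals/hashCode, so it works as a map key.
        List<String> key = Arrays.asList(type, value);
        return idsByKey.computeIfAbsent(key, k -> creator.apply(type, value));
    }

    /** Number of distinct pairs seen so far. */
    int size() {
        return idsByKey.size();
    }

    public static void main(String[] args) {
        long[] nextId = {0}; // fake id generator standing in for real node creation
        GetOrCreateCache cache = new GetOrCreateCache((type, value) -> nextId[0]++);
        long first = cache.getOrCreate("user", "alice");
        long second = cache.getOrCreate("user", "bob");
        long again = cache.getOrCreate("user", "alice");
        System.out.println(first + " " + second + " " + again + " size=" + cache.size());
    }
}
```

In the demo, the repeated lookup of the same pair returns the id handed out the first time instead of creating a second node.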

+1

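What is your batch size? Could it be that you have created an outer transaction, so that all the transactions you create here are just nested ones? –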

Answers

1

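What we do is commit the transaction every 5000 nodes, and that works perfectly. The most obvious drawback is that you cannot roll back the first 5000 nodes when something fails at node 5001.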
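
The fixed-size batching described here can be sketched independently of Neo4j. In the example below, `BatchedCommit` and `forEachBatch` are hypothetical names, and the `commitBatch` callback is a stand-in for "open a transaction, process the batch, mark it successful, close it". It is a minimal sketch of the chunking logic only, not the answerer's actual code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: groups items into batches and hands each batch to a callback.
final class BatchedCommit {

    /**
     * Feeds items to commitBatch in chunks of at most batchSize.
     * Returns the number of batches that were handed over.
     */
    static <T> int forEachBatch(Iterable<T> items, int batchSize, Consumer<List<T>> commitBatch) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> batch = new ArrayList<>(batchSize);
        int batches = 0;
        for (T item : items) {
            batch.add(item);
            if (batch.size() == batchSize) {
                commitBatch.accept(batch);
                batches++;
                // Start a fresh list so the committed batch can be garbage collected.
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {
            // Hand over the final, partially filled batch; nothing is sent if it is empty.
            commitBatch.accept(batch);
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a", "b", "c", "d", "e");
        int batches = forEachBatch(lines, 2, batch -> System.out.println("commit " + batch));
        System.out.println(batches + " batches");
    }
}
```

With a batch size of 5000, a failure while committing one batch leaves all earlier batches committed, which is exactly the rollback limitation mentioned above.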

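As for the batch inserter question: you can use it if your program imports the data once and the database does not need to be available for other requests in the meantime. For all other large-import use cases, the batch inserter will not help you.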

+0

Could you post the JVM options and Neo4j properties you are using? I think something is wrong with mine. – vshanyiao

+0

We are using Neo4j's default settings. For the JVM, we had to add extra memory: -Xmx512M -XX:MaxPermSize=128M – Wouter
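
When comparing JVM settings like these, it can help to confirm from inside the process which heap limit the JVM actually ended up with. The sketch below uses only the standard `Runtime` API. The class name `HeapCheck` is made up for illustration, and the numbers it prints depend on the machine and on the flags the JVM was started with.

```java
// Hypothetical diagnostic: prints the heap limit and current usage of the running JVM.
final class HeapCheck {

    private static final long MEGABYTE = 1024L * 1024L;

    /** Upper bound on the heap the JVM will try to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / MEGABYTE;
    }

    /** Heap currently occupied (allocated minus free), in megabytes. */
    static long usedHeapMegabytes() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MEGABYTE;
    }

    public static void main(String[] args) {
        System.out.println("max heap:  " + maxHeapMegabytes() + " MB");
        System.out.println("used heap: " + usedHeapMegabytes() + " MB");
    }
}
```

Calling `usedHeapMegabytes()` after each committed batch during an import is one way to see whether memory is actually released between transactions.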