2016-02-03

Nutch 2.3.1 on Cassandra fails to start

I am trying to run Nutch 2.3.1 with Cassandra, following the steps at http://wiki.apache.org/nutch/Nutch2Cassandra. Finally, when I try to start Nutch with the command:

bin/crawl urls/ test http://localhost:8983/solr/ 2 

I get the following exception:

GeneratorJob: starting 
GeneratorJob: filtering: false 
GeneratorJob: normalizing: false 
GeneratorJob: topN: 50000 
GeneratorJob: java.lang.RuntimeException: job failed: name=[test]generate: 1454483370-31180, jobid=job_local1380148534_0001 
    at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:120)
    at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:227) 
    at org.apache.nutch.crawl.GeneratorJob.generate(GeneratorJob.java:256) 
    at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:322) 
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) 
    at org.apache.nutch.crawl.GeneratorJob.main(GeneratorJob.java:330) 

Error running: 
    /home/user/apache-nutch-2.3.1/runtime/local/bin/nutch generate -D mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D mapred.reduce.tasks.speculative.execution=false -D mapred.map.tasks.speculative.execution=false -D mapred.compress.map.output=true -topN 50000 -noNorm -noFilter -adddays 0 -crawlId webmd -batchId 1454483370-31180
Failed with exit value 255. 

When I check logs/hadoop.log, this is the error message:

2016-02-03 15:18:14,741 ERROR connection.HConnectionManager - Could not start connection pool for host localhost(127.0.0.1):9160 
... 
2016-02-03 15:18:15,185 ERROR store.CassandraStore - All host pools marked down. Retry burden pushed out to client. 
me.prettyprint.hector.api.exceptions.HectorException: All host pools marked down. Retry burden pushed out to client. 
    at me.prettyprint.cassandra.connection.HConnectionManager.getClientFromLBPolicy(HConnectionManager.java:390) 

But my Cassandra server is up:

runtime/local$ netstat -l |grep 9160 
tcp  0  0 172.16.230.130:9160  *:*      LISTEN 

Can anyone help solve this problem? Thanks.

Answers


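Cassandra is not listening on localhost; it is bound to 172.16.230.130, as the netstat output shows. That is why Nutch, which tries localhost(127.0.0.1):9160, cannot connect to the Cassandra store.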
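
One quick way to confirm this diagnosis is to check which of the two addresses actually accepts TCP connections on port 9160. The following is a minimal Python sketch and not part of the original answer; the helper name `can_connect` and the 2-second timeout are illustrative assumptions.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Addresses taken from the question: the one Nutch tried and the one
    # shown as LISTEN in the netstat output.
    for host in ("localhost", "172.16.230.130"):
        state = "reachable" if can_connect(host, 9160) else "NOT reachable"
        print(f"{host}:9160 is {state}")
```

If only 172.16.230.130 is reachable, the likely fix is to point Nutch at that address, for example through `gora.cassandrastore.servers` in conf/gora.properties (the wiki setup uses localhost:9160), or to make Cassandra listen on localhost through `rpc_address` in cassandra.yaml. Both are assumptions about this setup rather than something the answer states.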

Hope this helps.

李全安待辦事項