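Nutch 1.12 exception: java.io.IOException: No FileSystem for scheme: http

I am following https://wiki.apache.org/nutch/NutchTutorial and trying to install Nutch 1.12 and integrate it with Solr 5.5.2. I installed Nutch by following the steps in the tutorial, but when I try to integrate it with Solr by running the command below, it throws the exception below.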
bin/nutch index http://10.209.18.213:8983/solr crawl/crawldb/ -linkdb crawl/linkdb/ crawl/segments/* -filter -normalize
Exception
2016-08-11 09:18:40,076 WARN util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-08-11 09:18:40,383 WARN segment.SegmentChecker - The input path at crawldb is not a segment... skipping
2016-08-11 09:18:40,397 INFO segment.SegmentChecker - Segment dir is complete: crawl/segments/20160810110110.
2016-08-11 09:18:40,403 INFO segment.SegmentChecker - Segment dir is complete: crawl/segments/20160810112551.
2016-08-11 09:18:40,408 INFO segment.SegmentChecker - Segment dir is complete: crawl/segments/20160810112952.
2016-08-11 09:18:40,409 INFO indexer.IndexingJob - Indexer: starting at 2016-08-11 09:18:40
2016-08-11 09:18:40,415 INFO indexer.IndexingJob - Indexer: deleting gone documents: false
2016-08-11 09:18:40,415 INFO indexer.IndexingJob - Indexer: URL filtering: true
2016-08-11 09:18:40,415 INFO indexer.IndexingJob - Indexer: URL normalizing: true
2016-08-11 09:18:40,672 INFO indexer.IndexWriters - Adding org.apache.nutch.indexwriter.solr.SolrIndexWriter
2016-08-11 09:18:40,672 INFO indexer.IndexingJob - Active IndexWriters :
SOLRIndexWriter
solr.server.url : URL of the SOLR instance
solr.zookeeper.hosts : URL of the Zookeeper quorum
solr.commit.size : buffer size when sending to SOLR (default 1000)
solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
solr.auth : use authentication (default false)
solr.auth.username : username for authentication
solr.auth.password : password for authentication
2016-08-11 09:18:40,677 INFO indexer.IndexerMapReduce - IndexerMapReduce: crawldb: http://10.209.18.213:8983/solr
2016-08-11 09:18:40,677 INFO indexer.IndexerMapReduce - IndexerMapReduce: linkdb: crawl/linkdb
2016-08-11 09:18:40,677 INFO indexer.IndexerMapReduce - IndexerMapReduces: adding segment: crawl/segments/20160810110110
2016-08-11 09:18:40,683 INFO indexer.IndexerMapReduce - IndexerMapReduces: adding segment: crawl/segments/20160810112551
2016-08-11 09:18:40,684 INFO indexer.IndexerMapReduce - IndexerMapReduces: adding segment: crawl/segments/20160810112952
2016-08-11 09:18:41,362 ERROR indexer.IndexingJob - Indexer: java.io.IOException: No FileSystem for scheme: http
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2385)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2392)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:256)
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:45)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:304)
at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:520)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:512)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:833)
at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:145)
at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:228)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:237)
I have the same problem. Did you find a solution? – LucaoA