2011-05-05
2

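I am using Solr 3.1, Apache Tika 0.9 and SolrNet 0.3.1 to index documents such as .doc and .pdf files. Remote documents (.pdf, .doc) are not being indexed or extracted.

I have successfully extracted and indexed files against my local Solr instance using this code: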

    Startup.Init<Article>("http://k9server:8080/solr");
    ISolrOperations<Article> solr = ServiceLocator.Current.GetInstance<ISolrOperations<Article>>();
    string filecontent = null;
    // Send the file to Solr's extracting handler (Tika) and get the text back without indexing it.
    // Note: inside a verbatim string (@"...") a single backslash is enough.
    using (var file = File.OpenRead(@"D:\solr.doc")) {
        var response = solr.Extract(new ExtractParameters(file, "abcd1") {
            ExtractOnly = true,
            ExtractFormat = ExtractFormat.Text
        });
        filecontent = response.Content;
    }
    // Index the extracted text as a field of a regular document
    solr.Add(new Article() {
        ID = "36",
        EMAIL = "1234",
        COMMENTS = filecontent,
        PRO_ID = 256
    });
    // commit to the index
    solr.Commit();

But when I use the same code to extract or index files against the remote server, I get this error:

The remote server returned an error: (500) Internal Server Error. 
SolrNet.Exceptions.SolrConnectionException was unhandled 

Message

Apache Tomcat/6.0.32 - Error report HTTP Status 500 - org.apache.poi.poifs.filesystem.POIFSFileSystem.getRoot()Lorg/apache/poi/poifs/filesystem/DirectoryNode; 

java.lang.NoSuchMethodError: org.apache.poi.poifs.filesystem.POIFSFileSystem.getRoot()Lorg/apache/poi/poifs/filesystem/DirectoryNode; 
    at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaryEntryIfExists(SummaryExtractor.java:65) 
    at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaries(SummaryExtractor.java:57) 
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:164) 
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197) 
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197) 
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135) 
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:196) 
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:55) 
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) 
    at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:238) 
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1360) 
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356) 
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252) 
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) 
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) 
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) 
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) 
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) 
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) 
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) 
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298) 
    at org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:864) 
    at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:579) 
    at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1665) 
    at java.lang.Thread.run(Unknown Source) 

Description

The server encountered an internal error (org.apache.poi.poifs.filesystem.POIFSFileSystem.getRoot()Lorg/apache/poi/poifs/filesystem/DirectoryNode; 

java.lang.NoSuchMethodError: org.apache.poi.poifs.filesystem.POIFSFileSystem.getRoot()Lorg/apache/poi/poifs/filesystem/DirectoryNode; 
    at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaryEntryIfExists(SummaryExtractor.java:65) 
    at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaries(SummaryExtractor.java:57) 
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:164) 
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197) 
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197) 
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135) 
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:196) 
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:55) 
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) 
    at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:238) 
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1360) 
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356) 
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252) 
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) 
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) 
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) 
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) 
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) 
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) 
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) 
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298) 
    at org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:864) 
    at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:579) 
    at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1665) 
    at java.lang.Thread.run(Unknown Source) 
) that prevented it from fulfilling this request. 
    Source=SolrNet 
    StackTrace: 
     at SolrNet.Impl.SolrConnection.PostStream(String relativeUrl, String contentType, Stream content, IEnumerable`1 parameters) 
     at SolrNet.Commands.ExtractCommand.Execute(ISolrConnection connection) 
     at SolrNet.Impl.SolrBasicServer`1.Send(ISolrCommand cmd) 
     at SolrNet.Impl.SolrBasicServer`1.SendAndParseExtract(ISolrCommand cmd) 
     at SolrNet.Impl.SolrBasicServer`1.Extract(ExtractParameters parameters) 
     at SolrNet.Impl.SolrServer`1.Extract(ExtractParameters parameters) 
     at SolrNetSample.Program.Main(String[] args) in E:\TestProject\SolrNetSample\SolrNetSample\SolrNetSample\Program.cs:line 38 
     at System.AppDomain._nExecuteAssembly(Assembly assembly, String[] args) 
     at System.AppDomain.ExecuteAssembly(String assemblyFile, Evidence assemblySecurity, String[] args) 
     at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly() 
     at System.Threading.ThreadHelper.ThreadStart_Context(Object state) 
     at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state) 
     at System.Threading.ThreadHelper.ThreadStart() 
    InnerException: System.Net.WebException 
     Message=The remote server returned an error: (500) Internal Server Error. 
     Source=System 
     StackTrace: 
      at System.Net.HttpWebRequest.GetResponse() 
      at HttpWebAdapters.Adapters.HttpWebRequestAdapter.GetResponse() 
      at SolrNet.Impl.SolrConnection.GetResponse(IHttpWebRequest request) 
      at SolrNet.Impl.SolrConnection.PostStream(String relativeUrl, String contentType, Stream content, IEnumerable`1 parameters) 

Can you post the error log from the remote server? That might give some hint as to why it is generating a 500. – Gagravarr 2011-05-05 11:30:26

I have posted the error log, please check it. – Dhaval950 2011-05-05 12:53:35

Answers

1

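You have two different versions of Apache POI on the classpath of your remote server, which is why you are getting the exception you see. The NoSuchMethodError means Tika was compiled against a POI in which POIFSFileSystem.getRoot() returns DirectoryNode, but at runtime a POI jar without that method signature was loaded first.

You should remove the older version of POI and leave only the newer jars that ship with Solr/Tika. If you can't find it, see the POI FAQ for how to identify the extra jar.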
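
One way to hunt for the extra jar is to scan the directories that feed the remote server's classpath and list every jar that contains the POIFSFileSystem class. The following is only a minimal sketch using the JDK's own jar API; the class name `JarClassFinder`, the helper `jarsContaining`, and the directory passed on the command line are illustrative assumptions, not part of Solr, Tika or POI.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: not part of Solr, Tika or POI.
class JarClassFinder {

    /** Returns every .jar file under root that contains the given class entry. */
    static List<Path> jarsContaining(Path root, String classEntry) throws IOException {
        List<Path> jars;
        try (Stream<Path> walk = Files.walk(root)) {
            jars = walk.filter(p -> p.toString().endsWith(".jar"))
                       .collect(Collectors.toList());
        }
        List<Path> hits = new ArrayList<>();
        for (Path jar : jars) {
            try (JarFile jarFile = new JarFile(jar.toFile())) {
                if (jarFile.getEntry(classEntry) != null) {
                    hits.add(jar);
                }
            } catch (IOException e) {
                // Not a readable jar; skip it and keep scanning.
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the first argument is a directory on the server's classpath.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        String entry = "org/apache/poi/poifs/filesystem/POIFSFileSystem.class";
        for (Path jar : jarsContaining(root, entry)) {
            System.out.println(jar);
        }
    }
}
```

If this prints more than one jar for the server's library directories, the one that did not come with the Solr/Tika distribution is the likely candidate for removal.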

1

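If it works against the local Solr instance but not against another instance, then the other instance is probably not configured correctly.

Judging by the stack trace, the POI library on that instance is incorrect (probably the wrong version). Make sure you copy all of the Tika JARs from the Solr 3.1.0 distribution.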
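
To confirm which POI is actually being picked up, a small reflection check can report where a class was loaded from and what a given no-argument method returns. This is a hedged sketch: `ClassOrigin`, `locationOf` and `returnTypeOf` are made-up names, and the check is only meaningful when run with the same classpath as the Solr webapp on the remote server.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical diagnostic: not part of Solr, Tika or POI.
class ClassOrigin {

    /** Jar or directory the class was loaded from, or null for JDK bootstrap classes. */
    static URL locationOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        return source == null ? null : source.getLocation();
    }

    /** Return type name of a public no-argument method, or null if no such method exists. */
    static String returnTypeOf(Class<?> cls, String methodName) {
        try {
            return cls.getMethod(methodName).getReturnType().getName();
        } catch (NoSuchMethodException e) {
            return null;
        }
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Loaded by name so the sketch compiles without POI on the compile classpath.
        String name = args.length > 0
                ? args[0]
                : "org.apache.poi.poifs.filesystem.POIFSFileSystem";
        Class<?> cls = Class.forName(name);
        System.out.println(name + " loaded from " + locationOf(cls));
        System.out.println("getRoot() returns " + returnTypeOf(cls, "getRoot"));
    }
}
```

The descriptor in the error, `getRoot()Lorg/apache/poi/poifs/filesystem/DirectoryNode;`, names a no-argument method returning DirectoryNode. If the check reports a different return type, or a jar location outside the Solr distribution, that points at the mismatched POI jar.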