2015-10-13

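Solr 5.3.1 - DataImportHandler - TikaEntityProcessor cannot find my HTTP file

My table has a column that references a file by URL, along with other columns. A sample table is shown below. I am trying to index both the table rows and the contents of the files in Solr. The files are reachable over HTTP with the prefix 'http://domain.com/', for example 'http://domain.com/file/sample1.pdf'. I will not be able to access them through a file share.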

Filepath           Author  Title
file/sample1.pdf   Jack    title 1
file/sample2.pdf   Bob     title 2
file/sample3.docx  Tim     title 2

My database data-import XML looks like this:

<dataConfig>
    <dataSource name="dbrows" driver="oracle.jdbc.OracleDriver"
                url="jdbc:oracle:thin:@....."
                user="***"
                password="***"/>
    <dataSource type="BinFileDataSource" name="attachments" />

    <document>
        <entity name="docs" dataSource="dbrows"
                query="select 'http://domain.com/'||filepath as PATH,author,title from dummytable">

            <entity name="file"
                    processor="TikaEntityProcessor"
                    url="${docs.PATH}"
                    dataSource="attachments"
                    format="text"
                    onError="continue"
                    transformer="script:processFile">
                <field column="text" name="text" />
            </entity>
        </entity>
    </document>
</dataConfig>

The error I get is:

2015-10-13 23:15:43.859 WARN (Thread-25) [ x:db] o.a.s.h.d.FileDataSource FileDataSource.basePath is empty. Resolving to: C:\Users\asdf\Downloads\Solr\solr-5.3.1\server\. 
2015-10-13 23:15:43.860 ERROR (Thread-25) [ x:db] o.a.s.h.d.EntityProcessorWrapper Exception in entity : file:java.lang.RuntimeException: java.io.FileNotFoundException: Could not find file: http://domain.com/file/sample1.pdf (resolved to: C:\Users\asdf\Downloads\Solr\solr-5.3.1\server\.\http://domain.com/file/sample1.pdf 
    at org.apache.solr.handler.dataimport.FileDataSource.getFile(FileDataSource.java:126) 
    at org.apache.solr.handler.dataimport.BinFileDataSource.getData(BinFileDataSource.java:51) 
    at org.apache.solr.handler.dataimport.BinFileDataSource.getData(BinFileDataSource.java:42) 
    at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:131) 
    at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:243) 
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:475) 
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:514) 
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414) 
    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329) 
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232) 
    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416) 
    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480) 
    at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461) 
Caused by: java.io.FileNotFoundException: Could not find file: http://domain.com/file/sample1.pdf (resolved to: C:\Users\asdf\Downloads\Solr\solr-5.3.1\server\.\http://domain.com/file/sample1.pdf 
    at org.apache.solr.handler.dataimport.FileDataSource.getFile(FileDataSource.java:122) 
    ... 12 more 

2015-10-13 23:15:43.890 WARN (Thread-25) [ x:db] o.a.s.h.d.FileDataSource FileDataSource.basePath is empty. Resolving to: C:\Users\asdf\Downloads\Solr\solr-5.3.1\server\. 

Is this even possible? Any help is highly appreciated.

Answer

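Fixed. I used BinURLDataSource instead of BinFileDataSource. As the stack trace shows, BinFileDataSource goes through FileDataSource.getFile and treats the URL as a path on the local file system, which is why it was resolved against the Solr server directory. BinURLDataSource fetches the content over HTTP instead.

I changed this:

<dataSource type="BinFileDataSource" name="attachments" />

to this:

<dataSource type="BinURLDataSource" name="attachments" />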
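
For reference, here is a minimal sketch of how the complete data-config from the question might look with that one data source swapped. It is an illustration under a few assumptions rather than a tested configuration: the JDBC URL is left elided exactly as in the question, the connectionTimeout and readTimeout values (in milliseconds) are illustrative and optional, and the transformer="script:processFile" attribute is left out because the script itself was not shown.

```xml
<dataConfig>
    <!-- Row data from Oracle; the connection string is elided as in the question -->
    <dataSource name="dbrows" driver="oracle.jdbc.OracleDriver"
                url="jdbc:oracle:thin:@....."
                user="***"
                password="***"/>

    <!-- Binary content fetched over HTTP instead of from the local file system.
         The timeout values are assumptions for illustration; both attributes can be omitted. -->
    <dataSource type="BinURLDataSource" name="attachments"
                connectionTimeout="5000"
                readTimeout="10000"/>

    <document>
        <!-- Outer entity: one row per file, with the absolute URL built in SQL -->
        <entity name="docs" dataSource="dbrows"
                query="select 'http://domain.com/'||filepath as PATH,author,title from dummytable">

            <!-- Inner entity: Tika extracts text from the document at docs.PATH -->
            <entity name="file"
                    processor="TikaEntityProcessor"
                    url="${docs.PATH}"
                    dataSource="attachments"
                    format="text"
                    onError="continue">
                <field column="text" name="text" />
            </entity>
        </entity>
    </document>
</dataConfig>
```

Because the query already returns an absolute URL in PATH, nothing else in the entity definitions should need to change. The Solr server does need network access to domain.com for the fetch to succeed.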