2016-09-11

Unable to search in Lucene 6.2 using Scala

I am trying to index data from MySQL (read through Slick in Scala) with Lucene 6.2. Here is the code:

package oc.api.services 

/**
  * Created by sujit on 9/7/16.
  */
import org.apache.lucene.document._ 
import org.apache.lucene.analysis.standard.StandardAnalyzer 
import org.apache.lucene.index._ 
import org.apache.lucene.search.IndexSearcher 
import java.io.{File, IOException} 
import java.nio.file.Paths 

import akka.actor.ActorSystem 
import akka.event.{Logging, LoggingAdapter} 
import akka.stream.ActorMaterializer 
import oc.api.utils.{Config, DatabaseService} 
import org.apache.lucene.analysis.core.KeywordAnalyzer 
import org.apache.lucene.index.IndexWriterConfig.OpenMode 
import org.apache.lucene.queryparser.classic.{QueryParser} 
import org.apache.lucene.store.FSDirectory 

import scala.concurrent.ExecutionContext 

class Indexer extends Config { 
    implicit val actorSystem = ActorSystem() 
    implicit val executor: ExecutionContext = actorSystem.dispatcher 
    implicit val log: LoggingAdapter = Logging(actorSystem, getClass) 
    implicit val materializer: ActorMaterializer = ActorMaterializer() 

    val databaseService = new DatabaseService(jdbcUrl, dbUser, dbPassword) 

    val notesService = new NotesService(databaseService)  

    def setIndex = {
        val IndexStoreDir = Paths.get("/var/www/html/LuceneIndex")
        val analyzer = new KeywordAnalyzer()
        val writerConfig = new IndexWriterConfig(analyzer)
        writerConfig.setOpenMode(OpenMode.CREATE)
        writerConfig.setRAMBufferSizeMB(500)
        val directory = FSDirectory.open(IndexStoreDir)
        var writer = new IndexWriter(directory, writerConfig)
        val notes = notesService.getNotes() // Gets all notes from Slick. Data is coming in getNotes()
        var doc = new Document()
        var count = 0

        val stringType = new FieldType()
        notes.map(_.foreach {
            case (note) =>
                doc = new Document()

                var field = new TextField("title", note.title, Field.Store.YES)
                doc.add(field)

                field = new TextField("teaser", note.teaser, Field.Store.YES)
                doc.add(field)

                field = new TextField("description", note.description, Field.Store.YES)
                doc.add(field)

                writer.addDocument(doc)
        })
        writer.commit()
    }

    def search(keyword: String) = {
        val IndexStoreDir = Paths.get("/var/www/html/LuceneIndex")
        var directoryReader = DirectoryReader.open(FSDirectory.open(IndexStoreDir))
        val analyzer = new StandardAnalyzer()

        val searcher = new IndexSearcher(directoryReader)

        val mqp = new QueryParser("title", analyzer) // MultiFieldQueryParser(filesToSearch, analyzer)
        val query = mqp.parse(keyword)

        val hits = searcher.search(query, 10)
        val scoreDoc = hits.scoreDocs
        println(scoreDoc.length)
    }

} 

object Indexer extends App { 
    val index = new Indexer 
    index.setIndex 
    index.search("Donec") 
} 

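The setIndex function works as expected and writes the index to the given path. But when I search the index for a keyword, it returns 0 results. Is there a mistake in the search function? How can I fix this?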

Answer


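The main cause here is probably an analyzer mismatch. You index with KeywordAnalyzer, which does not analyze at all: it keeps each field value as a single token, unchanged. For searching, you use StandardAnalyzer. In your example, the query "Donec" is parsed and analyzed as title:donec, just as if you'd used new TermQuery(new Term("title", "donec")). Because the keyword analyzer was used at index time, this only matches documents whose entire title is exactly donec. You should use the same analyzer for indexing as for searching.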
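The mismatch can be reproduced in isolation. The following is a minimal sketch, not the asker's code: it assumes Lucene 6.x with the analyzers-common and queryparser modules on the classpath, uses an in-memory RAMDirectory instead of the on-disk index, and indexes a made-up title. `AnalyzerMismatchDemo` and `hitsFor` are hypothetical names introduced for illustration.

```scala
import org.apache.lucene.analysis.Analyzer
import org.apache.lucene.analysis.core.KeywordAnalyzer
import org.apache.lucene.analysis.standard.StandardAnalyzer
import org.apache.lucene.document.{Document, Field, TextField}
import org.apache.lucene.index.{DirectoryReader, IndexWriter, IndexWriterConfig}
import org.apache.lucene.queryparser.classic.QueryParser
import org.apache.lucene.search.IndexSearcher
import org.apache.lucene.store.RAMDirectory

object AnalyzerMismatchDemo {
  // Index one document with `indexAnalyzer`, then count the hits for
  // `queryText` parsed with a StandardAnalyzer, as the question's search does.
  def hitsFor(indexAnalyzer: Analyzer, queryText: String): Int = {
    val directory = new RAMDirectory() // in-memory index, for this demo only
    val writer = new IndexWriter(directory, new IndexWriterConfig(indexAnalyzer))
    val doc = new Document()
    doc.add(new TextField("title", "Donec vitae sapien", Field.Store.YES)) // made-up sample title
    writer.addDocument(doc)
    writer.close() // close() commits the pending document

    val reader = DirectoryReader.open(directory)
    try {
      val query = new QueryParser("title", new StandardAnalyzer()).parse(queryText)
      new IndexSearcher(reader).search(query, 10).scoreDocs.length
    } finally reader.close()
  }

  def main(args: Array[String]): Unit = {
    // KeywordAnalyzer stores the single token "Donec vitae sapien",
    // so the analyzed query term "donec" finds nothing.
    println(hitsFor(new KeywordAnalyzer(), "Donec")) // expected: 0
    // StandardAnalyzer stores the tokens donec, vitae, sapien.
    println(hitsFor(new StandardAnalyzer(), "Donec")) // expected: 1
  }
}
```

Only the index-time analyzer differs between the two calls, which isolates the mismatch as the cause of the empty result.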

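Another possible problem (and here I can only guess) is that notesService.getNotes() returns a Future[_] or a similar asynchronous type, since it involves Slick. If so, the code inside the call to .map() that adds all the documents is only scheduled to run once the future has resolved. The writer.commit() call, however, runs on the calling thread, possibly before all the documents have been added, so you should move the commit into the map callback.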
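A sketch of the commit-inside-the-callback arrangement follows. It rests on assumptions: `Note` and `getNotes()` are hypothetical stand-ins for the asker's NotesService (the real return type is not shown in the question), the index lives in an in-memory RAMDirectory, and Lucene 6.x is on the classpath. The `Await` is there only so that the demo can open a reader after the commit.

```scala
import org.apache.lucene.analysis.standard.StandardAnalyzer
import org.apache.lucene.document.{Document, Field, TextField}
import org.apache.lucene.index.{DirectoryReader, IndexWriter, IndexWriterConfig}
import org.apache.lucene.store.RAMDirectory

import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object CommitInsideFutureDemo {
  // Hypothetical stand-in for notesService.getNotes(); assumed to return a Future.
  final case class Note(title: String)
  def getNotes(): Future[Seq[Note]] = Future {
    Thread.sleep(200) // simulate a slow database call
    Seq(Note("Donec vitae"), Note("Lorem ipsum"))
  }

  // Indexes all notes and returns the number of documents visible after the commit.
  def run(): Int = {
    val directory = new RAMDirectory()
    val writer = new IndexWriter(directory, new IndexWriterConfig(new StandardAnalyzer()))

    // Add the documents and commit inside the same callback, so the commit
    // can only run after every addDocument call has finished.
    val indexed: Future[Unit] = getNotes().map { notes =>
      notes.foreach { note =>
        val doc = new Document()
        doc.add(new TextField("title", note.title, Field.Store.YES))
        writer.addDocument(doc)
      }
      writer.commit()
    }

    // Block only in this demo, so the reader is opened after the commit.
    Await.result(indexed, 5.seconds)
    val reader = DirectoryReader.open(directory)
    try reader.numDocs()
    finally {
      reader.close()
      writer.close()
    }
  }

  def main(args: Array[String]): Unit =
    println(s"index holds ${run()} documents") // expected: index holds 2 documents
}
```

In an application that should not block, the search would instead be chained onto the returned future (for example with another `map`), so that it also runs only after the commit.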


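First, I made a mistake in using KeywordAnalyzer. Second, I now call writer.commit inside the Future's map. That gave me results. Thanks for your help. I have edited the question with the correct code. – Sujit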


How can I use threads in the code above to optimize the indexing process? – Sujit


@Sujit I suggest opening a new question for that. – knutwalker