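How to search a file using Lucene

I want to search the file "fdictionary.txt", which contains a list of words written one per line (230,000 words), for a query term. Any suggestions as to why this code does not work? The spell-checking part works and gives me the list of suggestions (I limited the length of the list to 1). What I want to do is search fdictionary first and, if the word is already there, not call the spell checker. My search function does not work, and it gives me no error! Here is what I have implemented: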

public class SpellCorrection {

    public static File indexDir = new File("/../idxDir");

    public static void main(String[] args) throws IOException, FileNotFoundException, CorruptIndexException, ParseException {

        Directory directory = FSDirectory.open(indexDir);
        SpellChecker spell = new SpellChecker(directory);

        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_20, null);
        File dictionary = new File("/../fdictionary00.txt");
        spell.indexDictionary(new PlainTextDictionary(dictionary), config, true);

        String query = "red"; //kne, console
        String correctedQuery = query; //kne, console

        if (!search(directory, query)) {
            String[] suggestions = spell.suggestSimilar(query, 1);
            if (suggestions != null) {
                correctedQuery = suggestions[0];
            }
        }

        System.out.println("The Query was: " + query);
        System.out.println("The Corrected Query is: " + correctedQuery);
    }

    public static boolean search(Directory directory, String queryTerm) throws FileNotFoundException, CorruptIndexException, IOException, ParseException {
        boolean isIn = false;

        IndexReader indexReader = IndexReader.open(directory);
        IndexSearcher indexSearcher = new IndexSearcher(indexReader);
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_20);

        Term term = new Term(queryTerm);
        Query termQuery = new TermQuery(term);
        TopDocs hits = indexSearcher.search(termQuery, 100);
        System.out.println(hits.totalHits);

        if (hits.totalHits > 0) {
            isIn = true;
        }
        return isIn;
    }
}

Source

2012-01-22 Marcus

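I believe your question has been answered. Please accept one of the answers. – naresh

Where do you index the contents of fdictionary00.txt?

You can only search with IndexSearcher once you have an index. If you are new to Lucene, you may want to look at some quick tutorials (e.g. http://lucenetutorial.com/lucene-in-5-minutes.html).

Source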

2012-01-22 07:46:58 naresh

Here: spell.indexDictionary(new PlainTextDictionary(dictionary), config, true); – Marcus

You need to index the data you want to search. Check the link I gave. – naresh

You never built an index.

You need to set up the index...

Directory directory = FSDirectory.open(indexDir); 
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_20); 
IndexWriter writer = new IndexWriter(directory,analyzer,true,IndexWriter.MaxFieldLength.UNLIMITED);

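Then you need to create a document and add each term to the document as an analyzed field.

Document doc = new Document();
doc.add(new Field("name", word, Field.Store.YES, Field.Index.ANALYZED));

Then add the document to the index.

writer.addDocument(doc);

writer.optimize();

Now commit the index and close the index writer.

writer.commit();
writer.close();

Source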

2012-01-23 02:06:34 SharpBarb

You could create a service that provides your SpellChecker instance and use spellChecker.exist(word).
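The control flow this implies (check for existence first, and only ask for a suggestion on a miss) can be sketched in plain Java without Lucene. This is only an illustration under stated assumptions: the class `LookupBeforeSuggest` is hypothetical, a `HashSet` stands in for the spell index, and a simple edit-distance scan stands in for suggestSimilar(word, 1).

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a service wrapping the spell checker.
final class LookupBeforeSuggest {

    // In-memory set of known words (assumption: replaces the Lucene index).
    private final Set<String> words;

    LookupBeforeSuggest(List<String> dictionary) {
        this.words = new HashSet<>(dictionary);
    }

    // Plays the role of spellChecker.exist(word) in this sketch.
    boolean exist(String word) {
        return words.contains(word);
    }

    // Plays the role of suggestSimilar(word, 1): the closest known word by
    // edit distance, ties broken alphabetically so the result is deterministic.
    String suggest(String word) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String candidate : words) {
            int d = editDistance(word, candidate);
            if (d < bestDistance || (d == bestDistance && candidate.compareTo(best) < 0)) {
                best = candidate;
                bestDistance = d;
            }
        }
        return best;
    }

    // Returns the query unchanged when it is a known word; otherwise the best
    // suggestion, or the query itself if the dictionary is empty.
    String correct(String query) {
        if (exist(query)) {
            return query;
        }
        String suggestion = suggest(query);
        return suggestion != null ? suggestion : query;
    }

    // Classic two-row dynamic-programming Levenshtein distance.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

With a dictionary containing "red", "read", "console" and "knee", correct("red") returns the query untouched, while correct("kne") falls through to the suggestion step.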

Note that SpellChecker will not index words of 2 characters or fewer. To work around this, you can add them to the index after creating it (add them to the SpellChecker.F_WORD field).
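To find out which dictionary entries are affected, one could scan the word list for entries of two characters or fewer. The following plain-Java sketch only illustrates that filter; the class name `ShortWordFilter` is hypothetical and the length threshold is taken from the statement above, not from Lucene's source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects the words a spell index would skip,
// assuming the cut-off is "2 characters or fewer" as described above.
final class ShortWordFilter {

    static final int MAX_SKIPPED_LENGTH = 2; // assumption from the text

    // Returns the entries that would need to be added separately,
    // preserving their order in the dictionary.
    static List<String> shortWords(List<String> dictionary) {
        List<String> result = new ArrayList<>();
        for (String word : dictionary) {
            if (word.length() <= MAX_SKIPPED_LENGTH) {
                result.add(word);
            }
        }
        return result;
    }
}
```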

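If you want to add words to your live index and make them available to exist(word), then you need to add them to the SpellChecker.F_WORD field. Of course, because you are not adding them to all the other fields such as gram/start/end etc., your words will not show up as suggestions for other misspelled words.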
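The gram/start/end fields hold n-gram fragments of each word, which is why a word stored only under the word field cannot be matched as a suggestion. A rough plain-Java illustration of that layout follows. It is a sketch and not Lucene's actual code: the class `GramFieldsSketch` is hypothetical, and the field names `start3`, `gram3` and `end3` are assumptions chosen to mirror the naming mentioned above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of per-word n-gram fields.
final class GramFieldsSketch {

    // All substrings of length n, in order of appearance.
    static List<String> ngrams(String word, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;
    }

    // Field name -> values for one word and one gram size. A word shorter
    // than n yields only the "word" field, so nothing could match it by gram.
    static Map<String, List<String>> fieldsFor(String word, int n) {
        Map<String, List<String>> fields = new LinkedHashMap<>();
        fields.put("word", Collections.singletonList(word));
        List<String> grams = ngrams(word, n);
        if (!grams.isEmpty()) {
            fields.put("start" + n, Collections.singletonList(grams.get(0)));
            fields.put("gram" + n, grams);
            fields.put("end" + n, Collections.singletonList(grams.get(grams.size() - 1)));
        }
        return fields;
    }
}
```

For example, fieldsFor("lucene", 3) contains the grams "luc", "uce", "cen" and "ene", with "luc" as the start gram and "ene" as the end gram.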

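In that case you have to add the word to your dictionary file, so that it becomes available as a suggestion when you re-create the index. It would be great if the project made SpellChecker.createDocument(...) public or protected rather than private, because that method does all the work of adding a word.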

After all of this you need to call spellChecker.setSpellIndex(directory).

Source

2013-10-01 03:53:54 Hoox
