Lucene search for two or more words not working on Android

I am using Lucene 3.6.2 on Android. The code I use and my observations are below.

Indexing code:

public void indexBookContent(Book book, File externalFilesDir) throws Exception { 
    IndexWriter indexWriter = null; 
    NIOFSDirectory directory = null; 

    directory = new NIOFSDirectory(new File(externalFilesDir.getPath() + "/IndexFile", book.getBookId())); 
    IndexWriterConfig indexWriterConfig = new IndexWriterConfig(LUCENE_36, new StandardAnalyzer(LUCENE_36)); 
    indexWriter = new IndexWriter(directory, indexWriterConfig); 

    Document document = createFieldsForContent(); 

    String pageContent = Html.fromHtml(decryptedPage).toString(); 
    ((Field) document.getFieldable("content")).setValue(pageContent); 
    ((Field) document.getFieldable("content")).setValue(pageContent); 
    ((Field) document.getFieldable("content")).setValue(pageContent.toLowerCase()); 
} 

private Document createFieldsForContent() { 
    Document document = new Document(); 

    Field contentFieldLower = new Field("content", "", YES, NOT_ANALYZED); 
    document.add(contentFieldLower); 
    Field contentField = new Field("content", "", YES, ANALYZED); 
    document.add(contentField); 
    Field contentFieldNotAnalysed = new Field("content", "", YES, NOT_ANALYZED); 
    document.add(contentFieldNotAnalysed); 
    Field recordIdField = new Field("recordId", "", YES, ANALYZED); 
    document.add(recordIdField); 
    return document; 
} 

public JSONArray searchBook(String bookId, String searchText, File externalFieldsDir, String filter) throws Exception { 
    List<SearchResultData> searchResults = null; 
    NIOFSDirectory directory = null; 
    IndexReader indexReader = null; 
    IndexSearcher indexSearcher = null; 

    directory = new NIOFSDirectory(new File(externalFieldsDir.getPath() + "/IndexFile", bookId)); 
    indexReader = IndexReader.open(directory); 
    indexSearcher = new IndexSearcher(indexReader); 

    Query finalQuery = constructSearchQuery(searchText, filter); 

    TopScoreDocCollector collector = TopScoreDocCollector.create(100, false); 
    indexSearcher.search(finalQuery, collector); 
    ScoreDoc[] scoreDocs = collector.topDocs().scoreDocs; 
} 

private Query constructSearchQuery(String searchText, String filter) throws ParseException { 
    QueryParser contentQueryParser = new QueryParser(LUCENE_36, "content", new StandardAnalyzer(LUCENE_36)); 
    contentQueryParser.setAllowLeadingWildcard(true); 
    contentQueryParser.setLowercaseExpandedTerms(false); 

    String wildCardSearchText = "*" + QueryParser.escape(searchText) + "*"; 

    // Query Parser used. 
    Query contentQuery = contentQueryParser.parse(wildCardSearchText); 
    return contentQueryParser.parse(wildCardSearchText); 
}

I have gone through "Lucene: Multi-word phrases as search terms", and my logic does not seem to be any different.

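My doubt is whether the fields are being overwritten. I also need Chinese language support, which works with this code apart from the problem with two or more words.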


2014-06-20 Zooter

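I don't quite understand what your exact problem is. As in the link you mentioned, you don't get the right results when you enter multiple words? Which field are you searching, and with which query? Please give an example. – Eypros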

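Let me state my observations here. Searching for a single word works fine, as does a single Chinese word or special characters. But if I search for two words, I get no results. I will update the code above to show the query details. – Zooter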

One note, up front:

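Seeing search implemented this way immediately looks a bit odd. It looks like an overcomplicated way of doing a linear search through all the available strings. I don't know exactly what you need to accomplish, but I suspect you would be better served by analyzing the text properly rather than wrapping keyword-analyzed text in double wildcards, which will perform poorly and won't give you much flexibility in searching.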
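To make the "linear search" remark concrete: when each page is indexed as one untokenized term and queried with `*text*`, the work done is roughly a substring test against every stored page. The following is a minimal plain-Java sketch of that equivalent scan. It does not use Lucene, and the class and method names (`LinearScanDemo`, `pagesContaining`) are hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; no Lucene classes are involved.
class LinearScanDemo {

    // Returns the positions of all pages whose text contains the needle.
    // Every page is visited, which is what "linear search" means here.
    static List<Integer> pagesContaining(List<String> pages, String needle) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < pages.size(); i++) {
            if (pages.get(i).contains(needle)) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> pages = Arrays.asList("one two three", "two terms here", "nothing");
        System.out.println(pagesContaining(pages, "two terms")); // prints [1]
    }
}
```

If this is all the search needs to do, an index adds little; the benefit of Lucene comes from analyzing the text into terms and querying those.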

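Moving on to the more specific problems:

You are analyzing the same content in the same field multiple times, using different analysis methods. If you really do need all of those forms of analysis available for searching, you should index them in separate fields instead. Searching them together makes no sense, so they shouldn't be in the same field.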

Then you have this pattern:

Field contentField = new Field("content", "", YES, ANALYZED); 
document.add(contentField); 
//Somewhat later 
((Field) document.getFieldable("content")).setValue(pageContent);

Don't do this; it makes no sense. Just pass your content into the constructor and add the field to your document:

Field contentField = new Field("content", pageContent, YES, ANALYZED); 
document.add(contentField);

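Especially if you choose to keep analyzing the content in several ways within the same field, there is no way to get at one particular instance among those fields (getFieldable always returns the first one added).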
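The effect of that "first one added" rule on the indexing code in the question can be sketched with a toy model. The classes below (`ToyDocument`, `ToyField`, `getFirst`) are hypothetical stand-ins written for this illustration, not Lucene's `Document` or `Field`; they only mimic the lookup-by-name behaviour described above.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a document; NOT the real Lucene Document class.
class ToyDocument {

    static final class ToyField {
        final String name;
        String value;

        ToyField(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    private final List<ToyField> fields = new ArrayList<>();

    void add(ToyField field) {
        fields.add(field);
    }

    // Mimics the "first field added under this name" lookup; null if absent.
    ToyField getFirst(String name) {
        for (ToyField f : fields) {
            if (f.name.equals(name)) {
                return f;
            }
        }
        return null;
    }

    // Collects the values of every field stored under the given name.
    List<String> values(String name) {
        List<String> out = new ArrayList<>();
        for (ToyField f : fields) {
            if (f.name.equals(name)) {
                out.add(f.value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        ToyDocument doc = new ToyDocument();
        // Three fields share one name, as in the question.
        doc.add(new ToyField("content", ""));
        doc.add(new ToyField("content", ""));
        doc.add(new ToyField("content", ""));

        // All three updates land on the first field; the other two stay empty.
        doc.getFirst("content").value = "Page Text";
        doc.getFirst("content").value = "Page Text";
        doc.getFirst("content").value = "page text";

        System.out.println(doc.values("content")); // prints [page text, , ]
    }
}
```

Under this model only one of the three fields ever receives text, which is consistent with the asker's suspicion that values were being overwritten.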

And this query:

String wildCardSearchText = "*" + QueryParser.escape(searchText) + "*";

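As you mentioned, this won't work well with multiple terms. The text is run through QueryParser syntax, so what you end up with is something like *two terms*, which will be searched as: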

field:*two field:terms*

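which won't produce any matches against your keyword field (presumably). QueryParser simply isn't well suited to this kind of query. You need to construct the wildcard query yourself here: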

WildcardQuery query = new WildcardQuery(new Term("field", "*two terms*"));
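To see why the parsed form differs from the hand-built one, here is a deliberately simplified plain-Java model of the whitespace splitting described above. It is not Lucene's parser: `toyClauses` and `singleTermPattern` are hypothetical helpers that only mimic the two outcomes, assuming whitespace is the sole separator.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model for illustration; NOT Lucene's QueryParser.
class ParserSplitDemo {

    // Splits the query text on whitespace and prefixes each piece with the
    // default field, imitating how "*two terms*" becomes two clauses.
    static List<String> toyClauses(String defaultField, String queryText) {
        List<String> clauses = new ArrayList<>();
        for (String token : queryText.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                clauses.add(defaultField + ":" + token);
            }
        }
        return clauses;
    }

    // Builds the single pattern that would be handed to one wildcard term,
    // keeping the space inside the pattern instead of splitting on it.
    static String singleTermPattern(String searchText) {
        return "*" + searchText + "*";
    }

    public static void main(String[] args) {
        System.out.println(toyClauses("content", "*two terms*")); // prints [content:*two, content:terms*]
        System.out.println(singleTermPattern("two terms"));       // prints *two terms*
    }
}
```

The first form asks for one term ending in "two" and another starting with "terms", which a field holding the whole page as a single term will not satisfy. The second keeps the phrase in one pattern, so it can match that single term. Note that a hand-built wildcard term bypasses the analyzer, so the pattern presumably has to match the case of whatever was indexed.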


2014-06-20 16:55:24 femtoRgon

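Thanks for the note. The reason I was using document.getFieldable is that I use the same method to create various documents for items other than "content". I have corrected it now. It works great. Thanks. – Zooter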
