What does LOWERCASE_EXPANDED_TERMS do in Lucene?
In Lucene, when I index fields whose values contain uppercase letters, I can't find them when searching. Here is some sample code:
public static void main(String[] args) throws Exception {
    //Create an index
    Directory index = new RAMDirectory();
    IndexWriter indexWriter = new IndexWriter(index, new IndexWriterConfig(new KeywordAnalyzer()));
    //Add a document to the index
    Document document = new Document();
    document.add(new StringField("path", "/home/user/file1", Field.Store.YES));
    document.add(new StringField("id", "file1", Field.Store.YES));
    indexWriter.addDocument(document);
    //Add a document to the index
    document = new Document();
    document.add(new StringField("path", "/HOME/user/file2", Field.Store.YES));
    document.add(new StringField("id", "file2", Field.Store.YES));
    indexWriter.addDocument(document);
    //Close the index.
    indexWriter.close();
    //Create a query parser.
    StandardQueryParser standardQueryParser = new StandardQueryParser(new KeywordAnalyzer());
    StandardQueryConfigHandler config = (StandardQueryConfigHandler) standardQueryParser.getQueryConfigHandler();
    config.set(StandardQueryConfigHandler.ConfigurationKeys.ANALYZER, new KeywordAnalyzer());
    config.set(StandardQueryConfigHandler.ConfigurationKeys.ALLOW_LEADING_WILDCARD, true);
    config.set(StandardQueryConfigHandler.ConfigurationKeys.LOWERCASE_EXPANDED_TERMS, true);
    //Run a query
    Query query = standardQueryParser.parse("path: \\/HOME*", "path");
    IndexSearcher indexSearcher = new IndexSearcher(DirectoryReader.open(index));
    TopDocs topDocs = indexSearcher.search(query, Integer.MAX_VALUE);
    //Iterate thru results
    for (ScoreDoc scoreDoc : topDocs.scoreDocs) {
        String id = indexSearcher.doc(scoreDoc.doc).get("id");
        System.out.println(id);
    }
}
Output:
file1
I expected something like this:
file1 file2
If I set LOWERCASE_EXPANDED_TERMS to false, the result is:
file2
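The Lucene documentation for LOWERCASE_EXPANDED_TERMS says: "Key used to set whether expanded terms should be lower-cased". Can someone clarify what this actually means? Why are my uppercase values being ignored? Should I call .toLowerCase() on every value to make it searchable?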
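To make the two results above concrete, here is a minimal plain-Java sketch of what I assume is going on. It does not use Lucene at all: the class `LowercaseExpandedTermsSketch` and the helper `matchPrefix` are hypothetical names made up for illustration. The assumption is that the prefix term from the query is lower-cased before matching, while the StringField values are stored and compared exactly as indexed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class LowercaseExpandedTermsSketch {

    // Hypothetical helper (not a Lucene API): returns the ids whose indexed
    // path starts with the given prefix. Only the prefix is optionally
    // lower-cased; the indexed values are compared exactly as stored.
    static List<String> matchPrefix(Map<String, String> pathsById,
                                    String prefix,
                                    boolean lowercasePrefix) {
        String effectivePrefix = lowercasePrefix
                ? prefix.toLowerCase(Locale.ROOT)
                : prefix;
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, String> entry : pathsById.entrySet()) {
            if (entry.getValue().startsWith(effectivePrefix)) {
                ids.add(entry.getKey());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Same two values as in the sample code above.
        Map<String, String> index = new LinkedHashMap<>();
        index.put("file1", "/home/user/file1");
        index.put("file2", "/HOME/user/file2");

        // Prefix lower-cased to "/home": only the lowercase path matches.
        System.out.println(matchPrefix(index, "/HOME", true));  // [file1]
        // Prefix left as "/HOME": only the uppercase path matches.
        System.out.println(matchPrefix(index, "/HOME", false)); // [file2]
    }
}
```

If that assumption is right, neither setting can return both documents on its own, and getting file1 and file2 together would require the indexed values to be lower-cased as well, which is what the last question is about.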