我索引一個用戶必須能夠搜索的大數據庫概述(只是文本字段)(在indexFields方法中)。此前的搜索是在ILIKE查詢的數據庫中完成的,但速度很慢,所以現在搜索是在索引上完成的。 Hovewer,當我比較db查詢的搜索結果,以及索引搜索得到的結果時,從索引搜索的結果總是少得多。 林不知道如果我在索引或搜索過程中犯了錯誤。對我來說,這一切似乎都有道理。有任何想法嗎?在lucene索引搜索中缺少匹配
這是代碼。所有建議讚賞!
// INDEXING
StandardAnalyzer analyzer = new StandardAnalyzer(
Version.LUCENE_CURRENT, stopSet); // stop set is empty
IndexWriter writer = new IndexWriter(INDEX_DIR, analyzer, true,
IndexWriter.MaxFieldLength.UNLIMITED);
indexFields(writer);
writer.optimize();
writer.commit();
writer.close();
analyzer.close();
private void indexFields(IndexWriter writer) {
DetachedCriteria criteria = DetachedCriteria
.forClass(Activit.class);
int count = 0;
int max = 50000;
boolean existMoreToIndex = true;
List<Activit> result = new ArrayList<Activit>();
while (existMoreToIndex) {
try {
result = activitService.listPaged(count, max);
if (result.size() < max)
existMoreToIndex = false;
if (result.size() == 0)
return;
for (Activit ao : result) {
Document doc = new Document();
doc.add(new Field("id", String.valueOf(ao.getId()),
Field.Store.YES, Field.Index.ANALYZED));
if(ao.getActivitOwner()!=null)
doc.add(new Field("field1", ao.getActivityOwner(),Field.Store.YES, Field.Index.ANALYZED));
if(ao.getActivitResponsible() != null)
doc.add(new Field("field2", ao.getActivityResponsible(), Field.Store.YES,Field.Index.ANALYZED));
try {
writer.addDocument(doc);
} catch (CorruptIndexException e) {
e.printStackTrace();
}
count += max;
//SEARCH
public List<Activit> searchActivitiesInIndex(String searchCriteria) {
Set<String> stopSet = new HashSet<String>(); // empty because we do not want to remove stop words
Version version = Version.LUCENE_CURRENT;
String[] fields = {
"field1", "field2"};
try {
File tempFile = new File("C://testindex");
Directory INDEX_DIR = new SimpleFSDirectory(tempFile);
Searcher searcher = new IndexSearcher(INDEX_DIR, true);
QueryParser parser = new MultiFieldQueryParser(version, fields, new StandardAnalyzer(
version, stopSet));
Query query = parser.parse(searchCriteria);
TopDocs topDocs = searcher.search(query, 500);
ScoreDoc[] hits = topDocs.scoreDocs;
//here i always get smaller hits lenght
searcher.close();
} catch (Exception e) {
e.printStackTrace();
}
}
打印topDocs.totalHits,如果你還沒有做。該號碼將爲您提供與您的查詢匹配的文件總數。 – 2011-06-07 08:13:27
@Shashikant Kore:我已經這麼做了,看到這個數字是對的,這就是爲什麼我發佈了這個問題。 – Julia 2011-06-07 11:26:48