Adding BM25 scoring in Lucene

I am new to Lucene. I am using it from Java with Lucene-3.6.0.jar, following the tutorial at http://www.tutorialspoint.com/lucene/. My basic code is as follows:
    import java.io.IOException;

    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.queryParser.ParseException;
    import org.apache.lucene.search.FuzzyQuery;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.Sort;
    import org.apache.lucene.search.TopDocs;

    // Indexer, Searcher, LuceneConstants and TextFileFilter are the helper
    // classes from the tutorial linked above.
    public class LuceneTester {
        String indexDir = "Data/Indexdir";
        String dataDir = "Data/Datadir";
        Indexer indexer;
        Searcher searcher;

        public static void test() {
            LuceneTester tester;
            try {
                tester = new LuceneTester();
                tester.createIndex();
                tester.search("malformed");
            } catch (IOException e) {
                e.printStackTrace();
            } catch (ParseException e) {
                e.printStackTrace();
            }
        }

        // Index every file in dataDir accepted by TextFileFilter.
        private void createIndex() throws IOException {
            indexer = new Indexer(indexDir);
            int numIndexed;
            long startTime = System.currentTimeMillis();
            numIndexed = indexer.createIndex(dataDir, new TextFileFilter());
            long endTime = System.currentTimeMillis();
            indexer.close();
            System.out.println(numIndexed + " File indexed, time taken: "
                    + (endTime - startTime) + " ms");
        }

        // Run a fuzzy query on the contents field and print the matching file paths.
        private void search(String searchQuery) throws IOException, ParseException {
            searcher = new Searcher(indexDir);
            long startTime = System.currentTimeMillis();
            Term term = new Term(LuceneConstants.CONTENTS, searchQuery);
            Query query = new FuzzyQuery(term);
            System.out.println("Query: " + query.toString());
            TopDocs hits = searcher.search(query, Sort.RELEVANCE);
            long endTime = System.currentTimeMillis();
            System.out.println(hits.totalHits + " documents found. Time :"
                    + (endTime - startTime));
            for (ScoreDoc scoreDoc : hits.scoreDocs) {
                Document doc = searcher.getDocument(scoreDoc);
                System.out.println("File: " + doc.get(LuceneConstants.FILE_PATH));
            }
            searcher.close();
        }
    }
Now, instead of the default scoring technique, I want to use BM25 similarity. How can I do that?
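Just throwing this out there: if you're only getting started with Lucene, you would probably be better off learning a newer version, if possible. Lucene's API has changed a great deal since 3.6. I believe Lucene 3.6 does not have BM25 available out of the box. In the current version (6.1), BM25 is actually the default similarity, and a number of other options are available. – femtoRgon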
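
To expand on the comment above: from Lucene 4.0 onward, `org.apache.lucene.search.similarities.BM25Similarity` exists and can be passed to `IndexSearcher.setSimilarity(...)` and `IndexWriterConfig.setSimilarity(...)`. If you have to stay on 3.6, you would need to compute the score yourself. Below is a minimal sketch of the textbook BM25 formula in plain Java, with no Lucene dependency. The class name `Bm25Sketch`, its method names and the sample numbers are hypothetical, chosen only for illustration. Lucene's own implementation differs in details such as how document length is stored, so treat this as a description of the idea rather than of Lucene's code.

```java
// Hypothetical helper, not part of Lucene: textbook BM25 for a single query term.
class Bm25Sketch {
    // Commonly used parameter values (assumed here; tune for your data).
    static final double K1 = 1.2;   // controls term-frequency saturation
    static final double B = 0.75;   // controls document-length normalisation

    // Inverse document frequency; the "1 +" inside the log keeps it non-negative.
    static double idf(long docCount, long docFreq) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    // BM25 contribution of one query term to one document.
    static double score(double termFreq, double docLen, double avgDocLen,
                        long docCount, long docFreq) {
        // Documents longer than average are penalised, shorter ones boosted.
        double lengthNorm = 1.0 - B + B * (docLen / avgDocLen);
        // Term frequency saturates: each extra occurrence adds less than the last.
        double tfPart = termFreq * (K1 + 1.0) / (termFreq + K1 * lengthNorm);
        return idf(docCount, docFreq) * tfPart;
    }

    public static void main(String[] args) {
        // Made-up collection: 3 documents, the term appears in 1 of them.
        long docCount = 3;
        long docFreq = 1;
        double avgDocLen = 100.0;
        System.out.println("tf=1, average length: " + score(1, 100, avgDocLen, docCount, docFreq));
        System.out.println("tf=2, average length: " + score(2, 100, avgDocLen, docCount, docFreq));
        System.out.println("tf=1, double length:  " + score(1, 200, avgDocLen, docCount, docFreq));
    }
}
```

For a multi-term query, the per-term scores are summed. Wiring this into Lucene 3.6's ranking is a separate problem that this sketch does not solve, which is another reason to consider upgrading.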