How do I merge two or more Lucene indexes and avoid duplicate values in the final index?
At the moment I merge the indexes with this code:
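import java.io.File;
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.LogMergePolicy;
import org.apache.lucene.index.SegmentInfo;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

// Set every option on the config before constructing the writer.
IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_36, new StandardAnalyzer(Version.LUCENE_36));
iwc.setRAMBufferSizeMB(50);
LogMergePolicy logMerge = new LogMergePolicy() {
    @Override
    protected long size(SegmentInfo info) throws IOException {
        return 0; // report every segment as the same size
    }
};
logMerge.setMergeFactor(1000);
iwc.setMergePolicy(logMerge); // the policy was created but never attached before
IndexWriter writer = new IndexWriter(getFSDirectory(INDEX_DIR), iwc);
Directory[] indexes = new Directory[INDEXES_DIR.size()];
for (int i = 0; i < INDEXES_DIR.size(); i++) {
    Directory d = FSDirectory.open(new File(INDEXES_DIR.get(i)).getAbsoluteFile());
    System.out.println("Adding: " + INDEXES_DIR.get(i));
    indexes[i] = d;
}
System.out.print("Merging added indexes...");
writer.addIndexes(indexes); // copies the source segments as they are; duplicates are not removed
writer.close(); // commit the merged index and release the write lock
System.out.println("done");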
When there are duplicates, which one should be kept? – jpountz
Any of them can be kept in the final index. – masm
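Since any copy may be kept, one possible approach is to let `addIndexes` copy everything and then delete the surplus copies in a second pass. The sketch below shows only the selection step in plain Java and does not call Lucene. It assumes every document carries a unique-key field (here imagined as `id`; that name is not in the question) and that the key of each document number has already been read into an array. `DuplicateFilter` and `docsToDelete` are hypothetical names, not Lucene API. The returned document numbers could then be removed through the index API, for example with `IndexReader.deleteDocument(int)` in Lucene 3.x, but that wiring is an assumption and is not shown here.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Lucene: picks which documents to drop after a merge.
final class DuplicateFilter {

    /**
     * keys[i] is the unique-key value of document number i, or null if the
     * document has no key. Returns the document numbers to delete so that
     * only the first document seen for each key remains.
     */
    static List<Integer> docsToDelete(String[] keys) {
        Set<String> seen = new HashSet<String>();
        List<Integer> toDelete = new ArrayList<Integer>();
        for (int docId = 0; docId < keys.length; docId++) {
            String key = keys[docId];
            if (key == null) {
                continue; // no key: cannot be matched against anything, so keep it
            }
            if (!seen.add(key)) {
                toDelete.add(docId); // key already seen: this is a later duplicate
            }
        }
        return toDelete;
    }

    public static void main(String[] args) {
        String[] keys = {"a", "b", "a", null, "b", "c"};
        System.out.println(docsToDelete(keys)); // prints [2, 4]
    }
}
```

Keeping the first occurrence is an arbitrary choice that fits the "any of them" answer above; keeping the last one instead would only require walking the array in reverse.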