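Java Lucene IndexReader not working correctly

Here is the story. I want to use a Lucene index in Java to mimic the behaviour of a relational database. I need to be able to search (read) and write at the same time.

For example, I want to save project information in the index. For simplicity, assume a project has two fields: id and name. Before adding a new project to the index, I search to see whether a project with the given id already exists. For this I use an IndexSearcher. This operation completes successfully (i.e. IndexSearcher returns the internal doc id of the document that contains the project id I am looking for).

Now I want to actually read the value of this project id, so I use an IndexReader to fetch the indexed Lucene Document, from which I should be able to extract the project id field. The problem is that IndexReader returns a Document in which all fields are null. So, to repeat: IndexSearcher works correctly, but IndexReader returns bogus data.

I suspect this is somehow related to the document field data not being saved to disk when the IndexWriter is flushed. The thing is, the first time I run this indexing operation, IndexReader works fine. After I restart my application, however, the situation described above occurs. So my guess is that the first time around the data is floating in RAM but is not flushed to disk correctly (or not completely, since IndexSearcher still works).

Maybe it will help if I give you the source code, so here it is (you can safely ignore the tryGetProjectIdFromMemory part; I use it as a speed optimization trick):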
public class ProjectMetadataIndexer {
    private File indexFolder;
    private Directory directory;
    private IndexSearcher indexSearcher;
    private IndexReader indexReader;
    private IndexWriter indexWriter;
    private Version luceneVersion = Version.LUCENE_31;
    private Map<String, Integer> inMemoryIdHolder;
    private final int memoryCapacity = 10000;

    public ProjectMetadataIndexer() throws IOException {
        inMemoryIdHolder = new HashMap<String, Integer>();
        indexFolder = new File(ConfigurationSingleton.getInstance()
                .getProjectMetaIndexFolder());
        directory = FSDirectory.open(indexFolder);
        IndexWriterConfig config = new IndexWriterConfig(luceneVersion,
                new WhitespaceAnalyzer(luceneVersion));
        indexWriter = new IndexWriter(directory, config);
        indexReader = IndexReader.open(indexWriter, false);
        indexSearcher = new IndexSearcher(indexReader);
    }

    public int getProjectId(String projectName) throws IOException {
        int fromMemoryId = tryGetProjectIdFromMemory(projectName);
        if (fromMemoryId >= 0) {
            return fromMemoryId;
        } else {
            int projectId;
            Term projectNameTerm = new Term("projectName", projectName);
            TermQuery projectNameQuery = new TermQuery(projectNameTerm);
            BooleanQuery query = new BooleanQuery();
            query.add(projectNameQuery, Occur.MUST);
            TopDocs docs = indexSearcher.search(query, 1);
            if (docs.totalHits == 0) {
                projectId = IDStore.getInstance().getProjectId();
                indexMeta(projectId, projectName);
            } else {
                int internalId = docs.scoreDocs[0].doc;
                indexWriter.close();
                indexReader.close();
                indexSearcher.close();
                indexReader = IndexReader.open(directory);
                Document document = indexReader.document(internalId);
                List<Fieldable> fields = document.getFields();
                System.out.println(document.get("projectId"));
                projectId = Integer.valueOf(document.get("projectId"));
            }
            storeInMemory(projectName, projectId);
            return projectId;
        }
    }

    private int tryGetProjectIdFromMemory(String projectName) {
        String key = projectName;
        Integer id = inMemoryIdHolder.get(key);
        if (id == null) {
            return -1;
        } else {
            return id.intValue();
        }
    }

    private void storeInMemory(String projectName, int projectId) {
        if (inMemoryIdHolder.size() > memoryCapacity) {
            inMemoryIdHolder.clear();
        }
        String key = projectName;
        inMemoryIdHolder.put(key, projectId);
    }

    private void indexMeta(int projectId, String projectName)
            throws CorruptIndexException, IOException {
        Document document = new Document();
        Field idField = new Field("projectId", String.valueOf(projectId),
                Store.NO, Index.ANALYZED);
        document.add(idField);
        Field nameField = new Field("projectName", projectName, Store.NO,
                Index.ANALYZED);
        document.add(nameField);
        indexWriter.addDocument(document);
    }

    public void close() throws CorruptIndexException, IOException {
        indexReader.close();
        indexWriter.close();
    }
}
To be more precise, the whole problem occurs in this if:

    if (docs.totalHits == 0) {
        projectId = IDStore.getInstance().getProjectId();
        indexMeta(projectId, projectName);
    } else {
        int internalId = docs.scoreDocs[0].doc;
        Document document = indexReader.document(internalId);
        List<Fieldable> fields = document.getFields();
        System.out.println(document.get("projectId"));
        projectId = Integer.valueOf(document.get("projectId"));
    }

Specifically, in the else branch. I don't know what is wrong.
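
One thing worth checking in the listing is that indexMeta creates both fields with Store.NO. In Lucene 3.x a field that is indexed but not stored can still be matched by a query, but its value is not part of the Document returned by IndexReader.document(), so document.get("projectId") comes back as null. That would also fit the "works the first time" symptom: for a new project the id goes straight into the in-memory map, so the else branch is not reached until the map is empty again after a restart. Below is a minimal sketch of the difference, assuming Lucene 3.1 on the classpath. The class name StoredFieldDemo, the helper readBack, the sample values, and the use of RAMDirectory and Index.NOT_ANALYZED are illustrative choices, not taken from the question.

```java
import java.io.IOException;

import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

// Hypothetical demo class: indexes one project and reads its id back.
public class StoredFieldDemo {

    // Indexes a single document whose projectId field uses the given Store
    // option, finds it by projectName, and returns document.get("projectId").
    public static String readBack(Field.Store idStore) throws IOException {
        Directory directory = new RAMDirectory(); // in-memory index, demo only
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_31,
                new WhitespaceAnalyzer(Version.LUCENE_31));
        IndexWriter writer = new IndexWriter(directory, config);

        Document doc = new Document();
        // NOT_ANALYZED keeps each value as one exact term (an assumption
        // here; the question's code uses Index.ANALYZED).
        doc.add(new Field("projectId", "42", idStore,
                Field.Index.NOT_ANALYZED));
        doc.add(new Field("projectName", "demo", Field.Store.NO,
                Field.Index.NOT_ANALYZED));
        writer.addDocument(doc);
        writer.close(); // closing the writer commits the document

        IndexReader reader = IndexReader.open(directory);
        IndexSearcher searcher = new IndexSearcher(reader);
        try {
            // The search only needs the field to be indexed, not stored.
            TopDocs hits = searcher.search(
                    new TermQuery(new Term("projectName", "demo")), 1);
            if (hits.totalHits == 0) {
                throw new IllegalStateException("document not found");
            }
            // Only stored fields are present in the loaded Document.
            Document found = reader.document(hits.scoreDocs[0].doc);
            return found.get("projectId");
        } finally {
            searcher.close();
            reader.close();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Store.NO  -> " + readBack(Field.Store.NO));
        System.out.println("Store.YES -> " + readBack(Field.Store.YES));
    }
}
```

In this sketch the search succeeds in both cases, but only the Store.YES variant returns the id. If the id has to be read back from the index, storing projectId is the relevant change. projectName only needs Store.YES if its value must be retrieved as well.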
Yes! That's it. I got confused by all those indexes. Thanks! – 2011-05-16 21:07:24