我的Lucene Java實現正在吃掉太多的文件。我遵循了Lucene Wiki中關於太多打開文件的說明,但這隻能幫助減緩這個問題。這裏是我的代碼對象(PTicket)添加到索引:Lucene Java打開太多文件。我正確使用IndexWriter嗎?
//This gets called when the bean is instantiated
public void initializeIndex() {
analyzer = new WhitespaceAnalyzer(Version.LUCENE_32);
config = new IndexWriterConfig(Version.LUCENE_32, analyzer);
}
public void addAllToIndex(Collection<PTicket> records) {
IndexWriter indexWriter = null;
config = new IndexWriterConfig(Version.LUCENE_32, analyzer);
try{
indexWriter = new IndexWriter(directory, config);
for(PTicket record : records) {
Document doc = new Document();
StringBuffer documentText = new StringBuffer();
doc.add(new Field("_id", record.getIdAsString(), Field.Store.YES, Field.Index.ANALYZED));
doc.add(new Field("_type", record.getType(), Field.Store.YES, Field.Index.ANALYZED));
for(String key : record.getProps().keySet()) {
List<String> vals = record.getProps().get(key);
for(String val : vals) {
addToDocument(doc, key, val);
documentText.append(val).append(" ");
}
}
addToDocument(doc, DOC_TEXT, documentText.toString());
indexWriter.addDocument(doc);
}
indexWriter.optimize();
} catch (Exception e) {
e.printStackTrace();
} finally {
cleanup(indexWriter);
}
}
private void cleanup(IndexWriter iw) {
if(iw == null) {
return;
}
try{
iw.close();
} catch (IOException ioe) {
logger.error("Error trying to close index writer");
logger.error("{}", ioe.getClass().getName());
logger.error("{}", ioe.getMessage());
}
}
private void addToDocument(Document doc, String field, String value) {
doc.add(new Field(field, value, Field.Store.YES, Field.Index.ANALYZED));
}
編輯添加的代碼搜索
public Set<Object> searchIndex(AthenaSearch search) {
try {
Query q = new QueryParser(Version.LUCENE_32, DOC_TEXT, analyzer).parse(query);
//search is actually instantiated in initialization. Lucene recommends this.
//IndexSearcher searcher = new IndexSearcher(directory, true);
TopDocs topDocs = searcher.search(q, numResults);
ScoreDoc[] hits = topDocs.scoreDocs;
for(int i=start;i<hits.length;++i) {
int docId = hits[i].doc;
Document d = searcher.doc(docId);
ids.add(d.get("_id"));
}
return ids;
} catch (Exception e) {
e.printStackTrace();
return null;
}
}
此代碼是一個Web應用程序。
1)這是建議的方式來使用IndexWriter(實例化每個添加索引新的一個)?
2)我讀過提高ulimit會有所幫助,但這看起來像是一種無法解決實際問題的創可貼。
3)問題可能出在IndexSearcher上嗎?
只是增加服務器上的文件描述符的數量 –