Lucene search with a wildcard is slow

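I have a list of 1000 files (the count doubles each year). Each file contains only text and is about 8 MB, and I am trying to find the file name(s) whose content matches a (wildcard) expression.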

For example, all files contain data like this:

COD1004129641208240002709991455671866 4IT /福林4400QQQUF 3300QQQUF

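and my search might be "*9991455671866", which should match the line above.

The problem (or maybe my expectations are too high) is that it takes more than a minute to return results.

My documents are indexed like this: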

private Document getDocument(File file) throws IOException 
{ 
    FileReader reader = new FileReader(file); 
    Document doc = new Document(); 
    doc.add(new Field(IndexProperties.FIELD_FILENAME, file.getName(), Field.Store.YES, Field.Index.NOT_ANALYZED)); 
    doc.add(new Field(IndexProperties.FIELD_CONTENT, reader)); 

    return doc; 
}

The analyzer and index writer:

Directory fsDir = FSDirectory.open(new File(indexFolder));
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_36);

// build the writer
IndexWriterConfig indexWriter = new IndexWriterConfig(Version.LUCENE_36, analyzer);
IndexWriter writer = new IndexWriter(fsDir, indexWriter);

And the wildcard search is:

public List<String> findFilenameByContent(String wildCardContent, String INDEX_FOLDER, String TICKETS_FOLDER) throws Exception 
{ 
    long start = System.currentTimeMillis(); 
    Term term = new Term(IndexProperties.FIELD_CONTENT, wildCardContent); //eg *9991455671866 
    Query query = new WildcardQuery(term); 

    //loop through docs 
    Directory fsDir = FSDirectory.open(new File(INDEX_FOLDER)); 
    IndexSearcher searcher = new IndexSearcher(IndexReader.open(fsDir)); 
    ScoreDoc[] queryResults = searcher.search(query, 10).scoreDocs; 
    List<String> strs = new ArrayList<String>(); 

    for (ScoreDoc scoreDoc : queryResults)
    {
        Document doc = searcher.doc(scoreDoc.doc);
        strs.add(doc.get(IndexProperties.FIELD_FILENAME));
    }

    searcher.close(); 
    long end = System.currentTimeMillis(); 
    System.out.println("TOTAL SEARCH TIME: "+(end-start)/1000.0+ "secs"); 
    return strs; 
}


2012-09-18 adhg

I don't see anything wrong with your code. If you only need to search, try:

IndexReader.open(fsDir,true);

Opening the reader read-only may improve your search time.

These suggestions may help.
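One possible explanation, offered as a sketch rather than a diagnosis: a pattern that starts with * gives Lucene no prefix to seek to in its sorted term dictionary, so it has to enumerate every term in the field and test each one. A common workaround is to also index each term reversed, so that a suffix search becomes a prefix search. The toy class below models that idea with a plain TreeSet. It is not Lucene code, and the names ReversedTermIndex and endingWith are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

/** Toy model of a sorted term dictionary that stores reversed terms (hypothetical, not a Lucene API). */
final class ReversedTermIndex {
    // Terms are stored reversed and sorted, so a suffix of the original
    // term becomes a prefix of the stored key.
    private final NavigableSet<String> reversedTerms = new TreeSet<String>();

    private static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    void add(String term) {
        reversedTerms.add(reverse(term));
    }

    /** Returns the original terms that end with the given suffix. */
    List<String> endingWith(String suffix) {
        String prefix = reverse(suffix);
        List<String> matches = new ArrayList<String>();
        // tailSet seeks to the first key >= prefix. All keys sharing the
        // prefix are contiguous, so we can stop at the first mismatch
        // instead of scanning the whole set.
        for (String key : reversedTerms.tailSet(prefix, true)) {
            if (!key.startsWith(prefix)) {
                break;
            }
            matches.add(reverse(key));
        }
        return matches;
    }

    public static void main(String[] args) {
        ReversedTermIndex index = new ReversedTermIndex();
        index.add("COD1004129641208240002709991455671866");
        index.add("4400QQQUF");
        index.add("3300QQQUF");
        // Equivalent of the wildcard *9991455671866
        System.out.println(index.endingWith("9991455671866"));
    }
}
```

In Lucene itself the same idea would mean adding a second field whose tokens are reversed at index time (the analyzers module ships a ReverseStringFilter for this) and querying it with the reversed pattern as a trailing wildcard. Whether that fits this index is an assumption: it only helps when the wildcard is at the start of the pattern, and it increases index size.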


2012-09-18 17:07:54

Thanks @fer13488; your suggestion is deprecated as of 3.6. – adhg
