Strange Lucene behavior

I want to start using Lucene. The code I use to index documents is:

public void index(String type, String words) {
    IndexWriter indexWriter = null;
    try {
        if (dir == null)
            dir = createAndPropagate();
        indexWriter = new IndexWriter(dir, new StandardAnalyzer(), true,
                new KeepOnlyLastCommitDeletionPolicy(),
                IndexWriter.MaxFieldLength.UNLIMITED);

        Field wordsField = new Field(FIELD_WORDS, words, Field.Store.YES,
                Field.Index.ANALYZED);
        Field typeField = new Field(FIELD_TYPE, type, Field.Store.YES,
                Field.Index.ANALYZED);

        Document doc = new Document();
        doc.add(wordsField);
        doc.add(typeField);

        indexWriter.addDocument(doc);
        indexWriter.commit();
    } catch (IOException e) {
        logger.error("Problems while adding entry to index.", e);
    } finally {
        try {
            if (indexWriter != null)
                indexWriter.close();
        } catch (IOException e) {
            logger.error("Unable to close index writer.", e);
        }
    }
}

The search looks like this:

public List<TagSearchEntity> searchFor(final String type, String words,
        int amount) {
    List<TagSearchEntity> result = new ArrayList<TagSearchEntity>();

    try {
        if (dir == null)
            dir = createAndPropagate();

        for (final Document doc : searchFor(dir, type, words, amount)) {
            @SuppressWarnings("serial")
            TagSearchEntity searchResult = new TagSearchEntity() {{
                setType(type);
                setWords(doc.getField(FIELD_WORDS).stringValue());
            }};
            result.add(searchResult);
        }
    } catch (IOException e) {
        logger.error("Problems while searching", e);
    }

    return result;
}

private List<Document> searchFor(Directory indexDirectory, String type,
        String words, int amount) throws IOException {
    Searcher indexSearcher = new IndexSearcher(indexDirectory);

    final Query tagQuery = new TermQuery(new Term(FIELD_WORDS, words));
    final Query typeQuery = new TermQuery(new Term(FIELD_TYPE, type));

    @SuppressWarnings("serial")
    BooleanQuery query = new BooleanQuery() {{
        add(tagQuery, BooleanClause.Occur.SHOULD);
        add(typeQuery, BooleanClause.Occur.MUST);
    }};

    List<Document> result = new ArrayList<Document>();

    for (ScoreDoc scoreDoc : indexSearcher.search(query, amount).scoreDocs) {
        result.add(indexSearcher.doc(scoreDoc.doc));
    }

    indexSearcher.close();

    return result;
}

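I have two use cases. The first adds a document of one type and searches for it, then adds a document of another type and searches for that, and so on. The other adds all the documents first and then searches for them. The first one works fine: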

@Test
public void testSearch() {
    search.index("type1", "test type1 for test purposes test test");
    List<TagSearchEntity> result = search.searchFor("type1", "test", 10);
    assertNotNull("Retrieved list should not be null.", result);
    assertTrue("Retrieved list should not be empty.", !result.isEmpty());

    search.index("type2", "test type2 for test purposes test test");
    result.clear();
    result = search.searchFor("type2", "test", 10);
    assertTrue("Retrieved list should not be empty.", !result.isEmpty());

    search.index("type3", "test type3 for test purposes test test");
    result.clear();
    result = search.searchFor("type3", "test", 10);
    assertTrue("Retrieved list should not be empty.", !result.isEmpty());
}

But the other one seems to index only the last document:

@Test
public void testBuggy() {
    search.index("type1", "test type1 for test purposes test test");
    search.index("type2", "test type2 for test purposes test test");
    search.index("type3", "test type3 for test purposes test test");

    List<TagSearchEntity> result = search.searchFor("type3", "test", 10);
    assertNotNull("Retrieved list should not be null.", result);
    assertTrue("Retrieved list should not be empty.", !result.isEmpty());

    result.clear();
    result = search.searchFor("type2", "test", 10);
    assertTrue("Retrieved list should not be empty.", !result.isEmpty());

    result.clear();
    result = search.searchFor("type1", "test", 10);
    assertTrue("Retrieved list should not be empty.", !result.isEmpty());
}

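It successfully finds type3 but fails to find any of the others. If I reorder the index calls, it still finds only the document that was indexed last. The Lucene version I am using is: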

<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-core</artifactId>
    <version>2.4.1</version>
</dependency>

<dependency>
    <groupId>lucene</groupId>
    <artifactId>lucene</artifactId>
    <version>1.4.3</version>
</dependency>

What am I doing wrong? How do I make it index all the documents?


2011-01-26 folone

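A new index is created on every indexing operation. The third constructor argument is the create flag, and you have it set to true. According to the documentation of IndexWriter, when this flag is set the writer creates a new index or overwrites an existing one. Set it to false to append to the existing index.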
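To see why testSearch passes while testBuggy fails, here is a minimal toy model of the create flag. It is only a sketch of the overwrite-versus-append semantics described above: ToyDirectory, ToyWriter and CreateFlagDemo are hypothetical names invented for illustration and are not Lucene classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for an index directory; NOT a Lucene class. */
class ToyDirectory {
    final List<String> docs = new ArrayList<String>();
}

/** Toy stand-in for a writer with a create flag; NOT Lucene's IndexWriter. */
class ToyWriter {
    private final ToyDirectory dir;

    ToyWriter(ToyDirectory dir, boolean create) {
        this.dir = dir;
        if (create) {
            // create == true: start a new index, discarding earlier documents
            dir.docs.clear();
        }
    }

    void addDocument(String doc) {
        dir.docs.add(doc);
    }
}

public class CreateFlagDemo {
    /** Opens a fresh writer for every document, as index() in the question does. */
    static List<String> indexAll(boolean create, String... docs) {
        ToyDirectory dir = new ToyDirectory();
        for (String doc : docs) {
            new ToyWriter(dir, create).addDocument(doc);
        }
        return dir.docs;
    }

    public static void main(String[] args) {
        System.out.println(indexAll(true, "type1", "type2", "type3"));  // prints [type3]
        System.out.println(indexAll(false, "type1", "type2", "type3")); // prints [type1, type2, type3]
    }
}
```

With the flag set to true, only the most recently added document survives. That matches testSearch, where each search immediately follows the matching index call, and it explains testBuggy, where type1 and type2 are already gone by the time they are searched. In the question's code the corresponding change is passing false as the third constructor argument. One caveat, as far as I recall for this Lucene generation: opening with create set to false fails when the directory holds no index yet, so the very first open may still need true. Computing the flag as !IndexReader.indexExists(dir) is one way to handle that, assuming that helper is available in your version.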


2011-01-26 12:42:36

Thank you very much, this completely solved my problem. – folone 2011-01-26 13:13:28
