2011-01-26 22 views
0

Strange Lucene behavior

I want to start using Lucene. The code I use to index documents is:

public void index(String type, String words) { 
     IndexWriter indexWriter = null; 
     try { 
      if (dir == null) 
       dir = createAndPropagate(); 
      indexWriter = new IndexWriter(dir, new StandardAnalyzer(), true, 
        new KeepOnlyLastCommitDeletionPolicy(), 
        IndexWriter.MaxFieldLength.UNLIMITED); 

      Field wordsField = new Field(FIELD_WORDS, words, Field.Store.YES, 
        Field.Index.ANALYZED); 
      Field typeField = new Field(FIELD_TYPE, type, Field.Store.YES, 
        Field.Index.ANALYZED); 

      Document doc = new Document(); 
      doc.add(wordsField); 
      doc.add(typeField); 

      indexWriter.addDocument(doc); 
      indexWriter.commit(); 
     } catch (IOException e) { 
      logger.error("Problems while adding entry to index.", e); 
     } finally { 
      try { 
       if (indexWriter != null) 
        indexWriter.close(); 
      } catch (IOException e) { 
       logger.error("Unable to close index writer.", e); 
      } 
     } 

    } 

The search looks like this:

public List<TagSearchEntity> searchFor(final String type, String words, 
      int amount) { 
     List<TagSearchEntity> result = new ArrayList<TagSearchEntity>(); 

     try { 
      if (dir == null) 
       dir = createAndPropagate(); 

      for (final Document doc : searchFor(dir, type, words, amount)) { 
       @SuppressWarnings("serial") 
       TagSearchEntity searchResult = new TagSearchEntity() {{ 
        setType(type); 
        setWords(doc.getField(FIELD_WORDS).stringValue()); 
       }}; 
       result.add(searchResult); 
      } 
     } catch (IOException e) { 
      logger.error("Problems while searching", e); 
     } 

     return result; 
    } 

private List<Document> searchFor(Directory indexDirectory, String type, 
      String words, int amount) throws IOException { 
     Searcher indexSearcher = new IndexSearcher(indexDirectory); 

     final Query tagQuery = new TermQuery(new Term(FIELD_WORDS, words)); 
     final Query typeQuery = new TermQuery(new Term(FIELD_TYPE, type)); 

     @SuppressWarnings("serial") 
     BooleanQuery query = new BooleanQuery() {{ 
      add(tagQuery, BooleanClause.Occur.SHOULD); 
      add(typeQuery, BooleanClause.Occur.MUST); 
     }}; 

     List<Document> result = new ArrayList<Document>(); 

     for (ScoreDoc scoreDoc : indexSearcher.search(query, amount).scoreDocs) { 
      result.add(indexSearcher.doc(scoreDoc.doc)); 
     } 

     indexSearcher.close(); 

     return result; 
    } 

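I have two use cases. The first adds a document of some type, searches for it, then adds a document of another type, searches for that, and so on. The other adds all the documents first and then searches for them. The first one works fine: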

@Test 
    public void testSearch() { 
     search.index("type1", "test type1 for test purposes test test"); 
     List<TagSearchEntity> result = search.searchFor("type1", "test", 10); 
     assertNotNull("Retrieved list should not be null.", result); 
     assertTrue("Retrieved list should not be empty.", !result.isEmpty()); 

     search.index("type2", "test type2 for test purposes test test"); 
     result.clear(); 
     result = search.searchFor("type2", "test", 10); 
     assertTrue("Retrieved list should not be empty.", !result.isEmpty()); 

     search.index("type3", "test type3 for test purposes test test"); 
     result.clear(); 
     result = search.searchFor("type3", "test", 10); 
     assertTrue("Retrieved list should not be empty.", !result.isEmpty()); 
    } 

But the other one seems to index only the last document:

@Test 
    public void testBuggy() { 
     search.index("type1", "test type1 for test purposes test test"); 
     search.index("type2", "test type2 for test purposes test test"); 
     search.index("type3", "test type3 for test purposes test test"); 

     List<TagSearchEntity> result = search.searchFor("type3", "test", 10); 
     assertNotNull("Retrieved list should not be null.", result); 
     assertTrue("Retrieved list should not be empty.", !result.isEmpty()); 

     result.clear(); 
     result = search.searchFor("type2", "test", 10); 
     assertTrue("Retrieved list should not be empty.", !result.isEmpty()); 

     result.clear(); 
     result = search.searchFor("type1", "test", 10); 
     assertTrue("Retrieved list should not be empty.", !result.isEmpty()); 
    } 

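It successfully finds type3 but fails to find any of the others. If I reorder these calls, it still finds only the last indexed document. The Lucene version I'm using is: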

<dependency> 
     <groupId>org.apache.lucene</groupId> 
     <artifactId>lucene-core</artifactId> 
     <version>2.4.1</version> 
    </dependency> 

    <dependency> 
     <groupId>lucene</groupId> 
     <artifactId>lucene</artifactId> 
     <version>1.4.3</version> 
    </dependency> 

What am I doing wrong? How do I make it index all the documents?

Answer

2

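You are creating a new index on every index operation. The third argument to the IndexWriter constructor is the create flag, and your code sets it to true. According to the documentation of IndexWriter, when this flag is set the writer creates a new index or overwrites the existing one. Set it to false to append to the existing index.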
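To see why only the last document survives, here is a small dependency-free sketch of the flag's effect. It is a toy model, not Lucene: `ToyDirectory`, `ToyWriter` and `indexAll` are hypothetical names invented for illustration, and the only behavior modeled is "create=true wipes what is already there, create=false appends". Like the question's `index()` method, it opens a fresh writer for every document.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the create flag; NOT Lucene, all names here are hypothetical. */
public class CreateFlagDemo {

    /** Stand-in for an index directory that outlives individual writers. */
    static final class ToyDirectory {
        final List<String> docs = new ArrayList<>();
    }

    /** Stand-in for a writer: create=true overwrites, create=false appends. */
    static final class ToyWriter {
        private final ToyDirectory dir;

        ToyWriter(ToyDirectory dir, boolean create) {
            this.dir = dir;
            if (create) {
                // Models "create a new index or overwrite the existing one".
                dir.docs.clear();
            }
        }

        void addDocument(String doc) {
            dir.docs.add(doc);
        }
    }

    /** Mirrors the question's index(): one new writer per added document. */
    public static List<String> indexAll(boolean create, String... types) {
        ToyDirectory dir = new ToyDirectory();
        for (String type : types) {
            new ToyWriter(dir, create).addDocument(type);
        }
        return dir.docs;
    }

    public static void main(String[] args) {
        // create=true: each writer wipes the previous documents.
        System.out.println(indexAll(true, "type1", "type2", "type3"));  // prints [type3]
        // create=false: documents accumulate across writers.
        System.out.println(indexAll(false, "type1", "type2", "type3")); // prints [type1, type2, type3]
    }
}
```

This also explains why the first test passes: it searches for each document right after indexing it, before the next `index()` call overwrites the index.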

+0

Thanks a lot, this completely solved my problem. – folone 2011-01-26 13:13:28