Lucene.NET IndexWriter DeleteDocuments not working

Here is the code:

    Try
        Dim util As New IndexerUtil()
        Dim dir As Lucene.Net.Store.Directory = FSDirectory.Open(New DirectoryInfo(util.getIndexDir()))
        Dim indexWriter As New IndexWriter(dir, New SimpleAnalyzer(), IndexWriter.MaxFieldLength.UNLIMITED)

        Dim numDocs As Integer = indexWriter.NumDocs()

        indexWriter.DeleteDocuments(New Term("id", insightId))
        indexWriter.Optimize()
        indexWriter.Commit()
        indexWriter.Close()
        numDocs = indexWriter.NumDocs()

    Catch ex As Exception
        LOG.Error("Could not remove insight " + insightId + " from index", ex)
    End Try

numDocs = 85 both times.

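I also have a small GUI application that I wrote to read the index and print the documents in a readable format. The document whose id field equals insightId definitely exists, and it still exists after the "delete".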

Here is how the id field is created:

    doc.Add(New Field("id", insightID, Field.Store.YES, Field.Index.ANALYZED)) ' insightID is an integer


2011-06-30 ryoung

How do you create the id field when building the index? Can you post that code? Also, does the code throw any exception?

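As you may have discovered with your more recent post, your id field is not being indexed correctly because SimpleAnalyzer uses a LetterTokenizer, which only returns letters. A purely numeric id therefore produces no terms at all, so Term("id", insightId) has nothing to match.

Consider using a KeywordAnalyzer for the id field instead.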
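One way to apply that suggestion is a PerFieldAnalyzerWrapper, so that only the id field bypasses SimpleAnalyzer. This is a minimal sketch, assuming the Lucene.Net 2.9-era API used in the question; the module name, `indexPath`, and `insightId` are hypothetical placeholders, not the asker's actual code.

```vb
Imports System.IO
Imports Lucene.Net.Analysis
Imports Lucene.Net.Documents
Imports Lucene.Net.Index
Imports Lucene.Net.Store

Module KeywordIdIndexing
    ' indexPath and insightId are hypothetical parameters for illustration.
    Sub IndexInsight(indexPath As String, insightId As Integer)
        Dim dir As Lucene.Net.Store.Directory = FSDirectory.Open(New DirectoryInfo(indexPath))

        ' SimpleAnalyzer stays the default; KeywordAnalyzer is used for "id" only,
        ' so the whole value becomes a single term and digits are kept.
        Dim analyzer As New PerFieldAnalyzerWrapper(New SimpleAnalyzer())
        analyzer.AddAnalyzer("id", New KeywordAnalyzer())

        Dim writer As New IndexWriter(dir, analyzer, IndexWriter.MaxFieldLength.UNLIMITED)
        Try
            Dim doc As New Document()
            doc.Add(New Field("id", insightId.ToString(), Field.Store.YES, Field.Index.ANALYZED))
            writer.AddDocument(doc)
            writer.Commit()
        Finally
            writer.Close()
        End Try
    End Sub
End Module
```

Documents that were already indexed with SimpleAnalyzer have no id terms, so they would need to be re-indexed before a delete by id can find them.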


2011-06-30 17:14:44

Thanks! – ryoung

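If the id field is an integer and is probably only used to keep track of documents, wouldn't it be better to store it as a NOT_ANALYZED field? That way he wouldn't have to worry about analyzers at all. –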

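@George, you may well be right. If the OP is going to parse queries containing 'id:1234', the OP will still need a suitable analyzer, but that usually lives in a very different area of the code. –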
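The NOT_ANALYZED approach from the comments could look roughly like the following. It is a sketch under the same assumption of the Lucene.Net 2.9-era API; `AddInsight`, `RemoveInsight`, and the `writer` parameter are hypothetical names introduced for illustration.

```vb
Imports Lucene.Net.Documents
Imports Lucene.Net.Index

Module NotAnalyzedId
    ' Index time: the id is stored as one exact term, with no analyzer involved.
    Sub AddInsight(writer As IndexWriter, insightId As Integer)
        Dim doc As New Document()
        doc.Add(New Field("id", insightId.ToString(), Field.Store.YES, Field.Index.NOT_ANALYZED))
        writer.AddDocument(doc)
    End Sub

    ' Delete time: the Term text is compared as-is, so it must be
    ' the same string that was written at index time.
    Sub RemoveInsight(writer As IndexWriter, insightId As Integer)
        writer.DeleteDocuments(New Term("id", insightId.ToString()))
        writer.Commit()
    End Sub
End Module
```

Because DeleteDocuments(Term) does not run the term through an analyzer, keeping the id un-analyzed on both sides makes the indexed value and the delete term line up exactly.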

Since SimpleAnalyzer converts the input text to lowercase, the terms in your index will be lowercase. Are you sure "insightId" is lowercase as well?


2011-06-30 14:21:52 Anonymous

insightId is an integer value (I convert it to a string), so there is no upper/lower case involved. – ryoung

You should create a new IndexWriter to count the documents, rather than calling NumDocs on one that has already been closed.


2011-06-30 16:41:51 mathieu

Yes, I noticed that after posting. However, it still doesn't change the fact that the document is not being deleted. – ryoung
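To check the count after a delete without touching a closed writer, one option is to open a fresh read-only IndexReader once the writer has been committed and closed. This is a sketch only, again assuming the Lucene.Net 2.9-era API; `CountAfterDelete`, `indexPath`, and `insightId` are hypothetical names, and it uses a reader rather than the second IndexWriter suggested in the answer.

```vb
Imports System.IO
Imports Lucene.Net.Analysis
Imports Lucene.Net.Index
Imports Lucene.Net.Store

Module DeleteAndCount
    ' Deletes by id, then counts live documents with a new reader.
    Function CountAfterDelete(indexPath As String, insightId As Integer) As Integer
        Dim dir As Lucene.Net.Store.Directory = FSDirectory.Open(New DirectoryInfo(indexPath))

        Dim writer As New IndexWriter(dir, New KeywordAnalyzer(), IndexWriter.MaxFieldLength.UNLIMITED)
        Try
            writer.DeleteDocuments(New Term("id", insightId.ToString()))
            writer.Commit()
        Finally
            writer.Close()
        End Try

        ' Open the reader only after the writer is closed, so it sees the committed state.
        Dim reader As IndexReader = IndexReader.Open(dir, True)
        Try
            Return reader.NumDocs()
        Finally
            reader.Close()
        End Try
    End Function
End Module
```

As the comment above points out, this only fixes the measurement; the count will still not drop unless the id was indexed as a term that the delete can match.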
