SolrNet/Tomcat 7 - memory consumption grows alarmingly when writing several large documents

I am writing very large documents (large in both size and number) to a Solr index (100 fields, many of them numeric and some of them text). I am using Tomcat 7 on Windows 7 x64.

Based on @Maurico's suggestion when indexing millions of documents, I parallelized the write operations (see the code sample below).

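The method that writes to Solr is started as a Task from the main loop. (Note: I moved it into a task because the write operation takes too long and holds up the main application.)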

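The problem is that memory consumption grows out of control, and the culprit is the Solr write operations (when I comment them out, the run works fine). How should I deal with this? Through Tomcat? Or through SolrNet?

Thanks for your suggestions.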

    //main loop:
    {
        :
        :
        :
        //indexDocsList is the list I create in main loop and "chunk" it out to send to the task.
        List<IndexDocument> indexDocsList = new List<IndexDocument>();
        for (int n = 0; n < N; n++)
        {
            indexDocsList.Add(new IndexDocument{X=1, Y=2.....});
            if (n % 5 == 0) //every 5th time we write to solr
            {
                var chunk = new List<IndexDocument>(indexDocsList);
                indexDocsList.Clear();
                Task.Factory.StartNew(() => WriteToSolr(chunk)).ContinueWith(task => chunk.Clear());
                GC.Collect();
            }
        }
    }

    private void WriteToSolr(List<IndexDocument> indexDocsList)
    {
        try
        {
            if (indexDocsList == null) return;
            if (indexDocsList.Count <= 0) return;
            int fromInclusive = 0;
            int toExclusive = indexDocsList.Count;
            int subRangeSize = 25;

            //TO DO: This is still leaking some serious memory, need to fix this
            ParallelLoopResult results = Parallel.ForEach(Partitioner.Create(fromInclusive, toExclusive, subRangeSize), (range) =>
            {
                _solr.AddRange(indexDocsList.GetRange(range.Item1, range.Item2 - range.Item1));
                _solr.Commit();
            });

            indexDocsList.Clear();
            GC.Collect();
        }
        catch (Exception ex)
        {
            logger.ErrorException("WriteToSolr()", ex);
        }
        finally
        {
            GC.Collect();
        };
        return;
    }

Source

2012-12-02 Mikos

IMHO, this code is overly complicated... why not just use the code I posted on my blog? –

@Maurico - What difference would that make? I am only using a different parallelization routine. – Mikos

I guess my concern is that Tomcat seems to be chewing through a lot of memory. Am I doing something fundamentally wrong? – Mikos

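You are committing manually after every batch. A commit is the most expensive operation in Solr. In your case I would suggest auto-committing every x seconds and using the soft auto-commit feature (Solr 4.0). That should take care of the Solr side. You will also have to tune your JVM garbage collection options so that you do not get stop-the-world GC pauses.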

Source

2012-12-04 14:34:03 zbugs
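
The advice about not committing after every batch can be sketched on the client side roughly as follows. This is a minimal sketch and not code from the thread: `BatchingWriter<T>` is a hypothetical helper (it is not part of SolrNet), and its `sendBatch` delegate stands in for a call such as `_solr.AddRange(batch)` from the question. It sends fixed-size batches from a single background task, never commits per batch, and uses a bounded queue so the main loop cannot pile up unsent documents in memory.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper, not part of SolrNet: buffers documents in a bounded
// queue and hands them to sendBatch in fixed-size batches on one background task.
public sealed class BatchingWriter<T> : IDisposable
{
    private readonly BlockingCollection<T> _queue;
    private readonly Task _consumer;

    public BatchingWriter(Action<List<T>> sendBatch, int batchSize, int maxQueued)
    {
        // Add() blocks once maxQueued items are waiting, so the producer
        // cannot run arbitrarily far ahead of the writer.
        _queue = new BlockingCollection<T>(maxQueued);
        _consumer = Task.Factory.StartNew(() =>
        {
            var batch = new List<T>(batchSize);
            foreach (var doc in _queue.GetConsumingEnumerable())
            {
                batch.Add(doc);
                if (batch.Count == batchSize)
                {
                    sendBatch(batch);               // no commit here
                    batch = new List<T>(batchSize);
                }
            }
            if (batch.Count > 0) sendBatch(batch);  // flush the remainder
        }, TaskCreationOptions.LongRunning);
    }

    public void Add(T doc) { _queue.Add(doc); }

    // Signals the end of input and waits until every batch has been sent.
    public void Complete()
    {
        _queue.CompleteAdding();
        _consumer.Wait();
    }

    public void Dispose() { _queue.Dispose(); }
}

public static class Program
{
    public static void Main()
    {
        int sent = 0, batches = 0;
        // Assumption: with SolrNet the delegate would be something like
        //   batch => _solr.AddRange(batch)
        // Here it only counts, so the sketch runs without a Solr server.
        using (var writer = new BatchingWriter<int>(
            batch => { sent += batch.Count; batches++; }, 25, 100))
        {
            for (int n = 0; n < 60; n++) writer.Add(n);
            writer.Complete();
        }
        // A single commit (or Solr's auto-commit) would take over from here.
        Console.WriteLine("{0} documents in {1} batches", sent, batches);
    }
}
```

With this shape the main loop only calls `Add`, and committing is left to one final commit or to the auto-commit settings on the Solr side. The batch size and queue capacity used here (25 and 100) are illustrative values, not tuned recommendations.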
