我正在處理一個List <>集合，在兩個嵌套循環內向集合添加新對象。在循環完成執行後，有約50萬個項目被添加到集合中。C＃List <> Add（）方法的性能

首先，adition操作運行良好，但不久之後可以注意到性能下降，對於最後幾千個元素來說，延遲時間是無法忍受的。

我嘗試了各種技巧（初始化一定大小的集合 - 500000），用LinkedList替換列表<>集合，但它沒有太多幫助。

你能推薦一個提示來解決問題嗎？我更感興趣的是用更優化的結構來更改結構 - 例如，使用諸如加法之類的操作來執行比列表<>更好的結構。

方法，該方法更新列表

private void UpdateForecastList(ConcurrentDictionary<Int32, RegistroSalidaProductoPrevision> prediccion, bool soloMejoresMetodos = true) 
    { 
     foreach (KeyValuePair<int, RegistroSalidaProductoPrevision> kvp in prediccion) 
     { 
      KeyValuePair<int, RegistroSalidaProductoPrevision> localKvp = kvp; 

      IList<Prediccion> pExistente = prediccionList.Where(p => p.Id == localKvp.Key).ToList(); 

      Articulo articulo = (articuloList.Where(a => a.Id == localKvp.Key)).First(); 

      if (pExistente.Count > 0) 
      { 
       foreach (var p in pExistente) 
       { 
        prediccionList.Remove(p); 
       } 
      } 

      if (kvp.Value.Previsiones.Count > 0) 
      { 
       var previsiones = kvp.Value.Previsiones.Where(prevision => prevision.Value.LPrevision[1] != null).ToList(); 
       int previsionesCount = previsiones.Count; 

       for (int a = 0; a < previsionesCount; a++) 
       { 
        var registros = previsiones[a].Value.LPrevision[1].Serie; 
        int c = registros.Count; 

        if (soloMejoresMetodos) 
        { 
         if (localKvp.Value.MejorMetodo != previsiones[a].Key) continue; 
         for (int i = 0; i < c; i++) 
         { 
          var p = new Prediccion() 
             { 
              Id = articulo.Id, 
              Nombre = articulo.Codigo, 
              Descripcion = articulo.Descripcion, 
              NombreMetodo = 
               Utils.SplitStringByCapitals(previsiones[a].Value.NombreMetodo), 
              Fecha = registros[i].Fecha, 
              PrediccionArticulo = Math.Round(registros[i].Cantidad, 2), 
              EsMejorMetodo = 
               (previsiones[a].Value.NombreMetodo == localKvp.Value.MejorMetodo) 
                ? true 
                : false 
             }; 

          // This line experiences performance loss 
          prediccionList.Add(p); 
         } 
        } 
        else 
        { 
         for (int i = 0; i < c; i++) 
         { 
          prediccionList.Add(new Prediccion() 
                { 
                 Id = articulo.Id, 
                 Nombre = articulo.Codigo, 
                 Descripcion = articulo.Descripcion, 
                 NombreMetodo = previsiones[a].Value.NombreMetodo, 
                 Fecha = registros[i].Fecha, 
                 PrediccionArticulo = 
                  Math.Round(registros[i].Cantidad, 2), 
                 EsMejorMetodo = 
                  (previsiones[a].Value.NombreMetodo == 
                  localKvp.Value.MejorMetodo) 
                   ? true 
                   : false 
                }); 
         } 
        } 
       } 
      } 
      else 
      { 
       prediccionList.Add(new Prediccion() 
             { 
              Id = articulo.Id, 
              Nombre = articulo.Codigo, 
              Descripcion = articulo.Descripcion, 
              NombreMetodo = kvp.Value.ErroresDatos[0].Texto, 
             }); 
      } 
     } 
    }

該方法的小描述： - 該方法讀取的對象（併發字典），並與更新的列表（在這種情況下，鏈表）預測對應某篇文章。

併發字典對象不斷從各個併發訪問它的線程更新。

該列表使用對應於所有文章的空預測進行初始化;因此，例如，如果您有700篇文章，那麼在開始時，列表中將填充700個空白預測。

由於其中一個計算線程更新了conformant字典，因此會引發一個調用上述方法的事件，該事件輪流更新列表（predicateList）。

可以在prediccionList（本例中）中保存的記錄的最大數量約爲500000條記錄，但是在列表中添加大約40000條記錄後，可能會發現性能損失。

該代碼可能看起來有點生疏，因爲我嘗試了各種優化技巧（用for替換foreach'es，計算循環外的count，用LinkedList替換列表<>等）。最後，我得出結論：減慢執行時間的部分是「prediccionList.Add（p）;」行。

添加到列表中的對象是Prediccion類的實例;這個對象我認爲不是非常heacy，它只包含7個字段。

內存使用情況

我附加了內存分析的結果。使用的內存不超過256 MB，因此我不認爲這裏的內存應該是一個問題。 enter image description here

來源

2011-07-29 Linus

從哪裏得到500000個物品？ –

你能提供一個能夠重現問題的代碼示例嗎？ – alun

正在添加什麼類型的對象？ – jalf

-2

您可以使用速度更快（但不可查詢）的數組。我不知道你的代碼的細節，但你可能想要折射和使用數據庫。 500000項永遠不會很快

來源

2011-07-29 10:48:59

如何使用數組而不是List？您可以將其初始化爲初始大小（假設有500000個元素），如果這還不夠，請使用Array.Resize添加另一個100000.您只需跟蹤實際的元素數量，因爲Length屬性只會給您元素的數量。

但是，請注意，調用Array.Resize也可能很耗時，因爲基本上，將生成新大小的新數組，並且原始數組中的所有元素都將被複制到新數組中。你不應該經常這麼叫。

來源

2011-07-29 10:49:23

這會有幫助嗎？與初始化具有適當大小的列表有什麼不同，就像他已經做的那樣？ – jalf

@jalf，我不確定具體細節，但下面的列表使用數組。我會想象有很多創建新的數組，並複製涉及這麼多元素的數據。我可能是完全錯誤的，雖然:) –

@jalf，我只是假設'List <>'的實現引入比「普通數組」更多的開銷。我並不是說使用'Array'更快或更好，我只是建議嘗試一下，看看會發生什麼。 –

如果要添加到列表中的對象的大小很大，則可能會受到內存限制。

如果您的進程是32位，那麼在用完地址空間之前總共需要2GB，但如果是64位，則可能會輕鬆超出機器中的物理內存並啓動分頁到磁盤。

你的物體有多大？

來源

2011-07-29 10:53:59

我不知道你是如何計算2GB的，但是，32位是4GB。 – xxxxxxxxxadfas

32位進程分配2GB的4GB地址空間。如果進程是「大型地址識別」，則他們可以在32位Windows上訪問3GB，在64位Windows上訪問4GB。 64位Windows上的64位進程可以訪問8TB。 –

謝謝！一些鏈接爬行？ – xxxxxxxxxadfas

根據我的經驗，List<T>的性能取決於內存。它總是遵循相同的模式，插入速度快到一個點，然後性能急劇下降。在我的機器上，通常發生在我打1.2G記憶標記時。幾乎所有我試過的集合都有同樣的問題，所以我認爲它更像是一個.net潛在問題，而不是一個List<T>問題。

我會建議嘗試減少使用500.000的東西的物體大小（用int s代替long s等），然後嘗試一下。
但請注意，即使您管理它在您的計算機上快速運行，它可能已超過部署該應用程序的計算機的閾值。

來源

2011-07-29 10:55:42 SWeko

隨着你的清單變大，每次當它是擴大收集垃圾，框架複製其內容到一個新的列表位置，因爲垃圾收集器是如何工作的方式的時間。這就是爲什麼隨着它變得越來越大，它變得越來越慢。（GC on MSDN）

可能的解決方案（我能想到的）是使用一個預定義大小的列表或數組，它肯定不會填滿，或者如果這不是一個選項，那麼使用System.Collections.Generic。 LinkedList，但正如您已經嘗試過的那樣，您可能必須實現自定義列表，如果適用的話，單鏈接（LinkedList是雙鏈接的）。

爲了增加獲得良好答案的機會，您應該發佈您在收藏中保留的對象的代碼，並將其添加到添加項目的位置，以便我們可以更好地理解關於什麼。

此外，請看看http://www.simple-talk.com/dotnet/performance/the-top-5-.net-memory-management-misconceptions/，我認爲它會對你有所幫助。

UPDATE：索引應該是廉價的操作，但儘管如此，你可以嘗試讀取previsiones [A]（和registros [i]於嵌套循環）插入循環的開始局部變量，你能救幾個indexings的（如果clr沒有優化它，x 100000次迭代可能會有所不同）。

來源

2011-08-01 10:28:08

>每次擴展時，框架都會將其內容複製到新列表中這是我寫過的最不可思議的小說閱讀，自愛麗絲在仙境。繼續寫戈蘭。 –

@BoppityBop修復了違規詞。你是對的，一個不知道GC如何工作的人會被誤導，認爲每次將一個對象添加到列表中。 –

您是否嘗試過在初始化時提供容量？因此它不需要重新分配內存並將舊內容帶入新的內存空間。

List<long> thelist = new List<long>(500000);

來源

2014-10-20 12:47:25 tesla

使用Struct而不是Class可顯着提高您的性能。

您還可以通過丟失Prediccion類/ Struct中的字符串屬性來獲得性能。

我很感興趣，很長一段時間的實際影響，所以這裏是我的基準：

我把不同的數據結構，並把20個億個對象/結構在其中。下面是結果：

 
List: 
Adding 20000000 TestClass to a List`1 took 3563,2068 ms 
Accessing 20000000 TestClass from a List took 103,0203 ms 
Adding 20000000 TestStruct to a List`1 took 2239,9639 ms 
Accessing 20000000 TestStruct from a List took 254,3245 ms 

Initialized List: 
Adding 20000000 TestClass to a List`1 took 3774,772 ms 
Accessing 20000000 TestClass from a List took 99,0548 ms 
Adding 20000000 TestStruct to a List`1 took 1520,7765 ms 
Accessing 20000000 TestStruct from a List took 257,5064 ms 

LinkedList: 
Adding 20000000 TestClass to a LinkedList`1 took 6085,6478 ms 
Adding 20000000 TestStruct to a LinkedList`1 took 7771,2243 ms 

HashSet: 
Adding 20000000 TestClass to a HashSet`1 took 10816,8488 ms 
Adding 20000000 TestStruct to a HashSet`1 took 3694,5187 ms 

Now I added a string to the class/struct: 
List: 
Adding 20000000 TestClassWithString to a List`1 took 4925,1215 ms 
Accessing 20000000 TestClassWithString from a List took 120,0348 ms 
Adding 20000000 TestStructWithString to a List`1 took 3554,7463 ms 
Accessing 20000000 TestStructWithString from a List took 456,3299 ms

這是我的測試程序：

static void Main(string[] args) 
    { 
     const int noObjects = 20*1000*1000; 

     Console.WriteLine("List:"); 
     RunTest(new List<TestClass>(), noObjects); 
     RunTest(new List<TestStruct>(), noObjects); 
     Console.WriteLine(); 

     Console.WriteLine("Initialized List:"); 
     RunTest(new List<TestClass>(noObjects), noObjects); 
     RunTest(new List<TestStruct>(noObjects), noObjects); 
     Console.WriteLine(); 

     Console.WriteLine("LinkedList:"); 
     RunTest(new LinkedList<TestClass>(), noObjects); 
     RunTest(new LinkedList<TestStruct>(), noObjects); 
     Console.WriteLine(); 

     Console.WriteLine("HashSet:"); 
     RunTest(new HashSet<TestClass>(), noObjects); 
     RunTest(new HashSet<TestStruct>(), noObjects); 
     Console.WriteLine(); 

     Console.WriteLine("Now I added a string to the class/struct:"); 
     Console.WriteLine("List:"); 
     RunTest(new List<TestClassWithString>(), noObjects); 
     RunTest(new List<TestStructWithString>(), noObjects); 
     Console.WriteLine(); 

     Console.ReadLine(); 
    } 




    private static void RunTest<T>(ICollection<T> collection, int noObjects) where T : ITestThing 
    { 
     Stopwatch sw = new Stopwatch(); 
     sw.Restart(); 
     for (int i = 0; i < noObjects; i++) 
     { 
      var obj = Activator.CreateInstance<T>(); 
      obj.Initialize(); 
      collection.Add(obj); 
     } 
     sw.Stop(); 
     Console.WriteLine("Adding " + noObjects + " " + typeof(T).Name + " to a " + collection.GetType().Name + " took " + sw.Elapsed.TotalMilliseconds + " ms"); 

     if (collection is IList) 
     { 
      IList list = (IList) collection; 
      // access all list objects 
      sw.Restart(); 
      for (int i = 0; i < noObjects; i++) 
      { 
       var obj = list[i]; 
      } 
      sw.Stop(); 
      Console.WriteLine("Accessing " + noObjects + " " + typeof (T).Name + " from a List took " + sw.Elapsed.TotalMilliseconds + " ms"); 
     } 
    }

的TestClass和TestStruct看起來都像這樣（一個 '階級'，一個用 '結構'）：

public class TestClass : ITestThing 
{ 
    public int I1; 
    public int I2; 
    public double D1; 
    public double D2; 
    public long L1; 
    public long L2; 

    public void Initialize() 
    { 
     D1 = 1; 
     D2 = 2; 
     I1 = 3; 
     I2 = 4; 
     L1 = 5; 
     L2 = 6; 
    } 
}

只有TestStruct是public struct而不是public class和TestClassWithString和TestStructWithString public string S1它用「abc」初始化。

ITestThing就在那裏，因爲結構體不能有構造函數，所以我需要某種方式以通用的方式調用Initialize（）方法，但事實證明，如果我調用Initialize（）方法並沒有太大區別）或不。

請注意，如果我爲每個沒有任何Interface或Activator.CreateInstance的測試用例編寫代碼，那麼持續時間的差異將會更加極端，但是當我添加第二個測試時代碼變得太大了案例...

摘要

可以大大利用列表的初始大小改善你的表現，並把中的Structs它，而不是類實例（對象）。還要儘量避免在Structs中使用字符串，因爲每個String實例都是您試圖通過使用Struct而不是Object來避免的對象。

來源

2016-01-15 10:25:01 JCH2k

該問題與List或任何其他.NET數據結構的性能無關。你的問題純粹是算法。例如，你有這樣的代碼片段：

foreach (KeyValuePair<int, RegistroSalidaProductoPrevision> kvp in prediccion) 
    { 
     KeyValuePair<int, RegistroSalidaProductoPrevision> localKvp = kvp; 

     IList<Prediccion> pExistente = prediccionList.Where(p => p.Id == localKvp.Key).ToList(); 

     Articulo articulo = (articuloList.Where(a => a.Id == localKvp.Key)).First();

因此在詞典（prediccion）每一個項目，你遍歷整個prediccionList。你已經實現了n^2算法。執行該方法所需的時間量與prediccion.Count * prediccionList.Count成正比。

您需要更好的算法;不是更快的收集數據結構。

來源

2016-01-15 19:22:11

buhahaha ..給出真正答案的唯一的傢伙有0 upvotes和他的答案是在最後..一個人必須愛SO .. –

C＃List <> Add（）方法的性能

方法，該方法更新列表

內存使用情況

回答

相關問題