2016-02-12 53 views
0

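Parallel.For confusion (collection losing order)

I am trying to use Parallel.For() to load some files into a database. My problem is that the file ID passed to the database function is somehow incorrect; that is, the database returns the wrong data. I tried to verify this by using a ConcurrentDictionary to collect the ID/name pairs, once without parallelism and once with it. As I understand it, the two lists should be identical after the loops finish, but they are not. The code below simulates what I am doing in a very simplified way.

Does this make sense?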

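using System; 
using System.Collections.Concurrent; 
using System.IO; 
using System.Threading.Tasks; 

class Program 
    { 
     static ConcurrentDictionary<int, string> _cd = new ConcurrentDictionary<int, string>(); 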
     static void Main() 
     { 
      //simulate the situation 
      int[] idList = new int[] {1, 8, 12, 19, 25, 99}; 
      string[] fileList = new string[] {"file1", "file8", "file12", "file19", "file25", "file99"}; 

      //run in serial first 
      ProcessFiles(idList, fileList); 

      //write out pairs to text file 
      foreach (var item in _cd) 
      { 
       var key = item.Key; 
       var val = item.Value; 
       string line = string.Format("fileId is {0} and fileName is {1}", key, val); 

       File.AppendAllText(@"c:\serial.txt", line + Environment.NewLine); 
      } 
      //results of text file (all good): 
      //fileId is 1 and fileName is file1 
      //fileId is 8 and fileName is file8 
      //fileId is 12 and fileName is file12 
      //fileId is 19 and fileName is file19 
      //fileId is 25 and fileName is file25 
      //fileId is 99 and fileName is file99 

      _cd.Clear(); 

      //now run in parallel 
      ProcessFilesInParallel(idList, fileList); 

      //write out pairs to text file 
      foreach (var item in _cd) 
      { 
       var key = item.Key; 
       var val = item.Value; 
       string line = string.Format("fileId is {0} and fileName is {1}", key, val); 

       File.AppendAllText(@"c:\parallel.txt", line + Environment.NewLine); 
      } 

      //results of text file (1. some, not all, are mismatched and 2. not all elements got added): 
      //fileId is 8 and fileName is file8 
      //fileId is 12 and fileName is file19 
      //fileId is 19 and fileName is file12 
      //fileId is 25 and fileName is file25 
     } 

     private static void ProcessFiles(int[] Ids, string[] files) 
     { 
      int fileId = 0; 
      string fileName = string.Empty; 

      for (var i = 0; i < Ids.Length; i++) 
      { 
       fileId = Ids[i]; 
       fileName = GetControlFileMetaDataFromDB(fileId); 

       _cd.TryAdd(fileId, fileName); 
      } 
     } 

     private static void ProcessFilesInParallel(int[] Ids, string[] files) 
     { 
      int fileId = 0; 
      string fileName = string.Empty; 

      Parallel.For(0, Ids.Length, i => 
      { 
       fileId = Ids[i]; 

       //this is returning the wrong fileName 
       fileName = GetControlFileMetaDataFromDB(fileId); 

       _cd.TryAdd(fileId, fileName); 
      } 

      ); 
     } 

     private static string GetControlFileMetaDataFromDB(int fileId) 
     { 
      //removed for brevity: 
      //1. connect to oracle 
      //2. call function, passing file id 
      //3. iterate over data reader and look for the filename 

      while (reader.Read()) 
      { 
       //strip out the filename and return it to the caller 
       string value = reader[0].ToString(); 
       int endPos = value.IndexOf("txt"); 
       if (endPos != -1) 
       { 
        endPos += 3; 
        int startPos = value.IndexOf(":\\") - 1; 
        string path = value.Substring(startPos, endPos - startPos); 
        return Path.GetFileName(path); 
       } 
      } 

      //no matching row was found 
      return string.Empty; 
     } 
    } 
+0

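Please copy the code into an editor and try to compile it. There are a lot of errors. Please fix them and add the 'using' directives so that others can check the code. –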

+0

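I would be particularly interested in where the 'reader' variable comes from. Could it be that there is only one database connection and you are accessing it from multiple threads without synchronization? –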

+0

That is basic data access code, which I replaced with comments for brevity. The poster below nailed the problem. – inspectorGadget

Answers

7

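You have declared fileId and fileName outside of the Parallel.For, which means the same variables are shared by every iteration.

Since the iterations may well run in parallel on different threads, you are reassigning the variables while another simultaneous iteration may be using them.

What you need to do is move the variable declarations inside the loop, so that they are local to each iteration: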

Parallel.For(0, Ids.Length, i => 
{ 
    //declared inside the lambda, so each iteration has its own copy 
    int fileId = Ids[i]; 

    string fileName = GetControlFileMetaDataFromDB(fileId); 

    _cd.TryAdd(fileId, fileName); 
}); 
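
To make the point above easy to try without a database, here is a minimal, self-contained sketch. The names LocalVariablesDemo, Run and LookupFileName are hypothetical and not part of the original code; LookupFileName simply stands in for GetControlFileMetaDataFromDB and derives a name from the ID so the pairs can be checked.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

static class LocalVariablesDemo
{
    // Hypothetical stand-in for GetControlFileMetaDataFromDB: no database is
    // involved, the name is derived from the ID so the result is easy to check.
    static string LookupFileName(int fileId)
    {
        return "file" + fileId;
    }

    public static ConcurrentDictionary<int, string> Run(int[] ids)
    {
        var pairs = new ConcurrentDictionary<int, string>();

        Parallel.For(0, ids.Length, i =>
        {
            // Declared inside the lambda, so every iteration gets its own copies
            // and no other thread can overwrite them between the two statements.
            int fileId = ids[i];
            string fileName = LookupFileName(fileId);

            pairs.TryAdd(fileId, fileName);
        });

        return pairs;
    }

    static void Main()
    {
        int[] ids = { 1, 8, 12, 19, 25, 99 };
        var pairs = Run(ids);

        // Look entries up by ID instead of enumerating the dictionary,
        // because its enumeration order is not guaranteed.
        foreach (int id in ids)
        {
            Console.WriteLine("fileId is {0} and fileName is {1}", id, pairs[id]);
        }
    }
}
```

With the declarations inside the lambda, every ID should end up paired with its own name regardless of how the iterations are scheduled.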
+0

Thanks. I knew it was something stupid and simple. It is working perfectly now. – inspectorGadget

1

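The problem here is in the ProcessFilesInParallel(int[] Ids, string[] files) function. The iterations of the for loop execute in parallel, and you declared fileId and fileName outside the scope of the for, so these variables are shared by all iterations, which creates a race condition.

You can fix this by moving the fileId and fileName variables inside the for: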

private static void ProcessFilesInParallel(int[] Ids, string[] files) 
{ 
    Parallel.For(0, Ids.Length, i => 
    { 
     var fileId = Ids[i]; 

     //fileId is local to this iteration, so the lookup receives the right ID 
     var fileName = GetControlFileMetaDataFromDB(fileId); 

     _cd.TryAdd(fileId, fileName); 
    }); 
} 

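Also, in the title of the question, Parallel.For confusion (collection losing order), you say that the collection loses its order. As you can read here, no execution order is defined in a parallel loop.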
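
If the results need to come out in the same order as the input, one option is to give each iteration its own slot in an array instead of relying on the order in which iterations run. The sketch below is only an illustration under that assumption: OrderedResultsDemo, Run and LookupFileName are hypothetical names, and LookupFileName again stands in for the real database call.

```csharp
using System;
using System.Threading.Tasks;

static class OrderedResultsDemo
{
    // Hypothetical stand-in for the database lookup.
    static string LookupFileName(int fileId)
    {
        return "file" + fileId;
    }

    public static string[] Run(int[] ids)
    {
        // One slot per input element. Iteration i writes only to slot i, so no
        // two iterations touch the same element and no locking is needed.
        var names = new string[ids.Length];

        Parallel.For(0, ids.Length, i =>
        {
            names[i] = LookupFileName(ids[i]);
        });

        // Parallel.For returns only after all iterations have completed,
        // so the array is fully populated at this point.
        return names;
    }

    static void Main()
    {
        int[] ids = { 1, 8, 12, 19, 25, 99 };
        string[] names = Run(ids);

        // Walk the input order; names[i] belongs to ids[i].
        for (int i = 0; i < ids.Length; i++)
        {
            Console.WriteLine("fileId is {0} and fileName is {1}", ids[i], names[i]);
        }
    }
}
```

The iterations may still execute in any order; only the position of each result is fixed, which is usually what matters when the output is written out afterwards.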