C#: Force a clean run in a long-running SQL reader loop?

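I have a SQL data reader that reads 2 columns from a SQL database table. Once it has done its bit, it starts again, selecting another 2 columns.

I would pull the whole lot down in one go, but that brings its own set of challenges.

My problem is that the table contains a large amount of data (around 3 million rows), which makes working with the entire set somewhat problematic.

I'm trying to validate the field values, so I pull the ID column plus one of the other columns, run each value in that column through a validation pipeline, and store the results in another database.

When the reader reaches the end of handling one column, I need to force it to immediately clean up every block of RAM it used, because this process uses about 700 MB and it has about 200 columns to go through.

Without a full garbage collection I will definitely run out of RAM.

Does anyone have any ideas how I can do this?

I'm using lots of small reusable objects. My thinking was that I could call GC.Collect() at the end of each read cycle and that would flush everything out. Unfortunately, for some reason that isn't happening.

OK, I hope this fits, but here is the method in question...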

void AnalyseTable(string ObjectName, string TableName) 
{ 
    Console.WriteLine("Initialising analysis process for SF object \"" + ObjectName + "\""); 
    Console.WriteLine(" The data being used is in table [" + TableName + "]"); 
    // get some helpful stuff from the databases 
    SQLcols = Target.GetData("SELECT Column_Name, Is_Nullable, Data_Type, Character_Maximum_Length FROM information_schema.columns WHERE table_name = '" + TableName + "'"); 
    SFcols = SchemaSource.GetData("SELECT * FROM [" + ObjectName + "Fields]"); 
    PickLists = SchemaSource.GetData("SELECT * FROM [" + ObjectName + "PickLists]"); 

    // get the table definition 
    DataTable resultBatch = new DataTable(); 
    resultBatch.TableName = TableName; 
    int counter = 0; 

    foreach (DataRow Column in SQLcols.Rows) 
    { 
     if (Column["Column_Name"].ToString().ToLower() != "id") 
      resultBatch.Columns.Add(new DataColumn(Column["Column_Name"].ToString(), typeof(bool))); 
     else 
      resultBatch.Columns.Add(new DataColumn("ID", typeof(string))); 
    } 
    // create the validation results table 
    //SchemaSource.CreateTable(resultBatch, "ValidationResults_"); 
    // cache the id's from the source table in the validation table 
    //CacheIDColumn(TableName); 

    // validate the source table 
    // iterate through each sql column 
    foreach (DataRow Column in SQLcols.Rows) 
    { 
     // we do this here to save making this call a lot more later 
     string colName = Column["Column_Name"].ToString().ToLower(); 
     // id col is only used to identify records not in validation 
     if (colName != "id") 
     { 
      // prepare to process 
      counter = 0; 
      resultBatch.Rows.Clear(); 
      resultBatch.Columns.Clear(); 
      resultBatch.Columns.Add(new DataColumn("ID", typeof(string))); 
      resultBatch.Columns.Add(new DataColumn(colName, typeof(bool))); 

      // identify matching SF col 
      foreach (DataRow SFDefinition in SFcols.Rows) 
      { 
       // case insensitive compare on the col name to ensure we have a match ... 
       if (SFDefinition["Name"].ToString().ToLower() == colName) 
       { 
        // select the id column and the column data to validate (current column data) 
        using (SqlCommand com = new SqlCommand("SELECT ID, [" + colName + "] FROM [" + TableName + "]", new SqlConnection(ConfigurationManager.ConnectionStrings["AnalysisTarget"].ConnectionString))) 
        { 
         com.Connection.Open(); 
         SqlDataReader reader = com.ExecuteReader(); 

         Console.WriteLine(" Validating column \"" + colName + "\""); 
         // foreach row in the given object dataset 
         while (reader.Read()) 
         { 
          // create a new validation result row 
          DataRow result = resultBatch.NewRow(); 
          bool hasFailed = false; 
          // validate it 
          object vResult = ValidateFieldValue(SFDefinition, reader[Column["Column_Name"].ToString()]); 
          // if we have the relevant col definition lets decide how to validate this value ... 
          result[colName] = vResult; 

          if (vResult is bool) 
          { 
           // if it's deemed to have failed validation mark it as such 
           if (!(bool)vResult) 
            hasFailed = true; 
          } 

          // no point in adding rows we can't trace 
          if (reader["id"] != DBNull.Value && reader["id"] != null) 
          { 
           // add the failed row to the result set 
           if (hasFailed) 
           { 
            result["id"] = reader["id"]; 
            resultBatch.Rows.Add(result); 
           } 
          } 

          // submit to db in batches of 200 
          if (resultBatch.Rows.Count > 199) 
          { 
           counter += resultBatch.Rows.Count; 
           Console.Write(" Result batch completed,"); 
           SchemaSource.Update(resultBatch, "ValidationResults_"); 
           Console.WriteLine("  committed " + counter.ToString() + " fails to the database so far."); 
           Console.SetCursorPosition(0, Console.CursorTop-1); 
           resultBatch.Rows.Clear(); 
          } 
         } 
         // get rid of these likely very heavy objects 
         reader.Close(); 
         reader.Dispose(); 
         com.Connection.Close(); 
         com.Dispose(); 
         // ensure .Net does a full cleanup because we will need the resources. 
         GC.Collect(); 

         if (resultBatch.Rows.Count > 0) 
         { 
          counter += resultBatch.Rows.Count; 
          Console.WriteLine(" All batches for column complete,"); 
          SchemaSource.Update(resultBatch, "ValidationResults_"); 
          Console.WriteLine("  committed " + counter.ToString() + " fails to the database."); 
         } 
        } 
       } 
      } 
     } 

     Console.WriteLine(" Completed processing column \"" + colName + "\""); 
     Console.WriteLine(""); 
    } 

    Console.WriteLine("Object processing complete."); 
}

2010-05-26 War

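Why can't you run the validation in SQL Server? – 2010-05-26 16:28:11

The validation process involves some complex rules that can't be implemented in SQL. Values may be set based on values in other databases, possibly even on other networks. – War 2010-05-27 10:31:06

It seems this isn't an easy problem to solve... I'm now running this code on a machine with more memory, and I did make it a bit cleaner, but it still pushes my memory usage up to 4.5 GB, and I haven't yet tried it on a big table (there is one containing about 90 million records). – War 2010-06-04 16:07:10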

You could try the method mentioned in this article.
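
The linked article is not reproduced in this thread. Going by the asker's follow-up comment about having to tell the application to wait, the approach presumably pairs GC.Collect() with GC.WaitForPendingFinalizers(). A minimal sketch of that pattern follows; the helper name ForceFullCollection is an illustrative assumption, not something taken from the article.

```csharp
using System;

static class GcHelper
{
    // Hypothetical helper (not from the linked article): forces a blocking
    // collection, lets finalizers run, then collects what they released.
    public static long ForceFullCollection()
    {
        GC.Collect();                    // collect all generations
        GC.WaitForPendingFinalizers();   // block until the finalizer queue is empty
        GC.Collect();                    // reclaim objects the finalizers just released
        return GC.GetTotalMemory(false); // approximate number of bytes still allocated
    }
}
```

Calling GcHelper.ForceFullCollection() after each column's reader has been disposed would mirror the spot where the question already calls GC.Collect(). It can only reclaim objects that are no longer referenced, so it will not help if something still holds on to the per-row data.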

2010-05-26 16:25:10

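I didn't realise you then had to tell the application to wait... I thought it would just kick off when GC.Collect() was called... I'll give it a try. – War 2010-05-27 10:39:38

This seems to be the best solution offered... it's not ideal, but it seems to help a little... – War 2010-06-04 16:07:48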

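Open the reader with sequential access; it may give you the behaviour you need. Also, assuming this is a blob, you may be better off reading it in chunks.

Provides a way for the DataReader to handle rows that contain columns with large binary values. Rather than loading the entire row, SequentialAccess enables the DataReader to load data as a stream. You can then use the GetBytes or GetChars method to specify a byte location to start the read operation, and a limited buffer size for the data being returned.

When you specify SequentialAccess, you are required to read from the columns in the order they are returned, although you are not required to read each column. Once you have read past a location in the returned stream of data, data at or before that location can no longer be read from the DataReader. When using the OleDbDataReader, you can reread the current column value until reading past it. When using the SqlDataReader, you can read a column value only once.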
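
As a rough illustration of the passage above, the sketch below opens a reader with CommandBehavior.SequentialAccess and reads each column once, in ordinal order. The class and method names (SequentialRead, ReadPairs, Query) are hypothetical, and the code assumes a two-column result with the ID first, as in the question's SELECT ID, [col] query. It uses the provider-neutral DbConnection types so it does not depend on a particular SQL client package.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.Common;

static class SequentialRead
{
    // Hypothetical helper: reads (ID, value) pairs from a two-column result.
    // Ordinal 0 is read before ordinal 1 and each is read exactly once,
    // which is the access pattern SequentialAccess requires.
    public static IEnumerable<KeyValuePair<string, object>> ReadPairs(IDataReader reader)
    {
        while (reader.Read())
        {
            object id = reader.GetValue(0);    // first column, read once
            object value = reader.GetValue(1); // second column, read once
            if (id != DBNull.Value)
                yield return new KeyValuePair<string, object>(id.ToString(), value);
        }
    }

    // Opens the reader in sequential mode on an already-open connection.
    // The SQL text is supplied by the caller.
    public static IEnumerable<KeyValuePair<string, object>> Query(DbConnection openConnection, string sql)
    {
        using (DbCommand command = openConnection.CreateCommand())
        {
            command.CommandText = sql;
            using (DbDataReader reader = command.ExecuteReader(CommandBehavior.SequentialAccess))
            {
                foreach (var pair in ReadPairs(reader))
                    yield return pair;
            }
        }
    }
}
```

Note that the loop in the question reads the value column first and then reads reader["id"] more than once. Under SequentialAccess that order would have to change, with the ID read first and cached in a local variable.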

2010-05-26 16:22:49 eglasius

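I don't have a problem with the size of the data in any single field, just the sheer number of fields I need to read. – War 2010-05-27 13:32:50

@Wardy OK, re-read my answer, because it addresses that ^-^ – eglasius 2010-05-27 14:19:20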

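Can you post some code? .NET's data reader is supposed to be a 'fire hose' that is stingy with RAM unless, as Freddy says, your column data values are large. How long does this validation plus database write take?

In general, if a GC is needed and can be done, it will be done. I may sound like a broken record, but if you have to call GC.Collect(), something else is wrong.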
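
To illustrate the 'fire hose' point: if each row is validated as it is read and only the failing IDs are kept in a small buffer, memory use is bounded by the batch size rather than by the table size. The sketch below is one assumed way to shape such a loop, not the asker's code. StreamingValidator, isValid and flush are hypothetical names standing in for the question's ValidateFieldValue and SchemaSource.Update.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

static class StreamingValidator
{
    // Streams rows from a two-column reader (ID, value), buffers only the IDs
    // that fail validation, and hands each full batch to the flush callback.
    // The buffer is reused, so flush must consume the list before returning.
    // Returns the total number of failures.
    public static int Validate(IDataReader reader, Func<object, bool> isValid,
                               Action<IReadOnlyList<string>> flush, int batchSize = 200)
    {
        var failedIds = new List<string>(batchSize);
        int total = 0;
        while (reader.Read())
        {
            object id = reader.GetValue(0);
            object value = reader.GetValue(1);
            if (id == DBNull.Value || isValid(value))
                continue; // untraceable or valid: nothing is retained for this row

            failedIds.Add(id.ToString());
            if (failedIds.Count >= batchSize)
            {
                flush(failedIds);
                total += failedIds.Count;
                failedIds.Clear(); // reuse the same buffer for the next batch
            }
        }
        if (failedIds.Count > 0)
        {
            flush(failedIds); // final partial batch
            total += failedIds.Count;
        }
        return total;
    }
}
```

With this shape nothing is allocated per row beyond what the reader itself needs, so there should be little for an explicit GC.Collect() to reclaim.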

2010-05-26 16:29:20 n8wrl

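+1 I agree, forcing a garbage collection seems clearly wrong in this situation. Also, without sequential access, the reader will keep internal references to the instances the OP intends to release anyway. – eglasius 2010-05-26 16:40:39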
