2013-07-18

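Comparing two HashSets/Lists of data with the same hash code and similar data

I am writing software that compares the differences between two sets of binary data. The binary data contains coordinates, and the software also reads from a DBF file to get attribute information about each object. My software indicates whether objects have been moved, whether attributes have changed, and whether objects have been removed or added.

My custom object's hash code is generated from the list of coordinates in each instance. Each object also holds a data row. When I first try to find the "imperfect records" (basically any object without a match on both coordinates and data row), the comparison takes the data row into account, and I can use a HashSet because the data row makes each object unique enough.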

    public override int GetHashCode()
    {
        if (considerAttrs)
        {
            return (value.GetHashCode() + dbString.GetHashCode());
        }
        else
        {
            return value.GetHashCode();
        }
    }
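One thing worth noting about the method above: the hash code depends on a mutable flag. If PolyLineZ is a reference type, flipping considerAttrs on an object that is already stored in a HashSet changes its hash code after insertion, and the set can then no longer find it reliably. A possible alternative, sketched below, is to leave the object alone and pick the equality rule per collection with an IEqualityComparer. The Shape class and both comparer names are hypothetical stand-ins for illustration, not the actual PolyLineZ type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for PolyLineZ: only the members needed here.
public sealed class Shape
{
    public double[] Coords = new double[0];   // flattened coordinate list
    public string DbString = "";              // concatenated attribute row
}

// Equality on coordinates only (the considerAttrs == false case).
public sealed class CoordsComparer : IEqualityComparer<Shape>
{
    public bool Equals(Shape a, Shape b)
    {
        if (ReferenceEquals(a, b)) return true;
        if (a == null || b == null) return false;
        return a.Coords.SequenceEqual(b.Coords);
    }

    public int GetHashCode(Shape s)
    {
        unchecked
        {
            int h = 17;
            foreach (double c in s.Coords) h = h * 31 + c.GetHashCode();
            return h;
        }
    }
}

// Equality on coordinates and attributes (the considerAttrs == true case).
public sealed class CoordsAndAttrsComparer : IEqualityComparer<Shape>
{
    private static readonly CoordsComparer Inner = new CoordsComparer();

    public bool Equals(Shape a, Shape b)
    {
        // Same coordinates first, then same attribute string.
        return Inner.Equals(a, b)
            && (ReferenceEquals(a, b) || a.DbString == b.DbString);
    }

    public int GetHashCode(Shape s)
    {
        unchecked
        {
            return Inner.GetHashCode(s) * 31 + s.DbString.GetHashCode();
        }
    }
}
```

With this arrangement, `new HashSet<Shape>(new CoordsComparer())` and `new HashSet<Shape>(new CoordsAndAttrsComparer())` can hold the same objects under different rules, and no object's hash code changes while it sits in a collection.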

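Where I'm at:

At this point the software returns the objects without a perfect match (no exact match on both coordinates and attributes). This data is correct (at least I hope so).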

origUnfoundCount:114 modUnfoundCount:223

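It cannot find 114 objects in the original because 113 have modified attributes and 1 has been moved. It cannot find 223 in the modified set because 113 have modified attributes, 1 has been moved, and 109 have been added.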

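Where I'm stuck:

The software works beautifully with smaller data (a couple hundred records per list) when I set considerAttrs to false on every object and use a List rather than a HashSet. With larger data the speed penalty is drastic, so my application would only be useful for comparing small differences between data sets.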

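However, I need to use a List because a HashSet cannot contain duplicates, and a List is simply far too slow for this. A Dictionary can't be used either, because its keys have to be unique.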
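One common way around the unique-key restriction is to map each key to a list of values, so duplicates share a bucket instead of colliding. The sketch below is only an illustration under assumptions: the Record class, its ShapeKey field (assumed to be some stable key derived from the coordinates), and the ShapeIndex helper are all hypothetical names, not part of the code in this question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified record: a shape key plus its attribute string.
public sealed class Record
{
    public string ShapeKey = "";   // assumed: a stable key derived from the coordinates
    public string DbString = "";   // concatenated attribute row
    public int RecordNumber;
}

public static class ShapeIndex
{
    // Groups records by shape key; records with the same shape share one bucket.
    public static Dictionary<string, List<Record>> Build(IEnumerable<Record> records)
    {
        var index = new Dictionary<string, List<Record>>();
        foreach (Record r in records)
        {
            List<Record> bucket;
            if (!index.TryGetValue(r.ShapeKey, out bucket))
            {
                bucket = new List<Record>();
                index[r.ShapeKey] = bucket;
            }
            bucket.Add(r);
        }
        return index;
    }

    // Returns the records with the given shape, or an empty list if there are none.
    public static List<Record> Find(Dictionary<string, List<Record>> index, string shapeKey)
    {
        List<Record> bucket;
        return index.TryGetValue(shapeKey, out bucket) ? bucket : new List<Record>();
    }
}
```

Building the index once over one list and calling Find for each record of the other list replaces a full scan per record with a hash lookup plus a scan of one small bucket. LINQ's `ToLookup` offers the same key-to-many-values shape if a read-only structure is enough.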

I need a new approach; my logic below should give you the general gist of what I need it to do.

My current comparison code:

//ignored modified records 
    HashSet<int> ignoredRecNo = new HashSet<int>(); 
    //ignoring moved records 
    HashSet<String> ignoredDBstrings = new HashSet<string>(); 

    HashSet<String> columnNames = new HashSet<string>(); 
    HashSet<int> modModdedRNs = new HashSet<int>(); 
    columnNames = (HashSet<String>)HttpContext.Current.Session["columnNames"]; 

    List<PolyLineZ> originalNFs = new List<PolyLineZ>(); 
    List<PolyLineZ> modifiedNFs = new List<PolyLineZ>(); 

    List<PolyLineZ> removedList = new List<PolyLineZ>(); 
    List<PolyLineZ> movedList = new List<PolyLineZ>(); 
    List<PolyLineZ> modifiedList = new List<PolyLineZ>(); 
    List<PolyLineZ> modifiedMatchList = new List<PolyLineZ>(); 

    List<PolyLineZ> movedOrDeleted = new List<PolyLineZ>(); 

    if (HttpContext.Current.Session["origNFList"] != null) 
    { 
     origPolyLineZNFList = (HashSet<PolyLineZ>)HttpContext.Current.Session["origNFList"]; 
    } 

    if (HttpContext.Current.Session["modNFList"] != null) 
    { 
     modPolyLineZNFList = (HashSet<PolyLineZ>)HttpContext.Current.Session["modNFList"]; 
    } 


    //----------Generate Lists of string for each row---------------// 
    foreach (PolyLineZ polyLineZ in origPolyLineZNFList) 
    { 
     origDBStrings.Add(polyLineZ.dbString); 
     PolyLineZ temp = new PolyLineZ(); 
     temp = polyLineZ; 
     temp.considerAttrs = false; 
     originalNFs.Add(temp); 
    } 

    foreach (PolyLineZ polyLineZ in modPolyLineZNFList) 
    { 
     modDBStrings.Add(polyLineZ.dbString); 
     PolyLineZ temp = new PolyLineZ(); 
     temp = polyLineZ; 
     temp.considerAttrs = false; 
     modifiedNFs.Add(temp); 
    } 


    foreach (PolyLineZ modpolyLineZ in modifiedNFs) 
    { 
     bool foundAmatch = false; 
     foreach (PolyLineZ origPolyLineZ in originalNFs) 
     { 
      if (origPolyLineZ.Equals(modpolyLineZ)) 
      { 
       if (!modDBStrings.Contains(origPolyLineZ.dbString)) 
       { 
        //database modifications are in here       
        modModdedRNs.Add(origPolyLineZ.RecordNumber); 
        foundAmatch = true; 
        break; 
       } 
      } 
     } 

    } 

    foreach (PolyLineZ polyLineZ in originalNFs) 
    { 
     bool foundAmatch = false; 
     foreach (PolyLineZ modpolyLineZ in modifiedNFs) 
     { 
      if (foundAmatch) 
      { 
       break; 
      } 
      if (modpolyLineZ.Equals(polyLineZ)) 
      { 
       if (!origDBStrings.Contains(modpolyLineZ.dbString)) 
       { 

        foundAmatch = true; 
        //database modifications are in here        
        ignoredRecNo.Add(modpolyLineZ.RecordNumber); 
        ignoredDBstrings.Add(modpolyLineZ.dbString); 
        modifiedList.Add(polyLineZ); 
        modifiedMatchList.Add(modpolyLineZ); 
        break; 

       } // end db string comparison 

      } //end shape equals if 

     } //end modNF loop 

     if (!foundAmatch) 
     { 
      movedOrDeleted.Add(polyLineZ); 
      ignoredDBstrings.Add(polyLineZ.dbString); 
      ignoredRecNo.Add(polyLineZ.RecordNumber); 
     } 

    } //end origNF loop 


    result += "movedDeletedCount: " + movedOrDeleted.Count + "<br/>"; 
    foreach (PolyLineZ polylineZ in movedOrDeleted) 
    { 

     if (!modDBStrings.Contains(polylineZ.dbString)) 
     { 
       removedList.Add(polylineZ); 
     } 
     else 
     { 

       movedList.Add(polylineZ); 
     } 
    } 

    /*************************** ITERATE DATABASE CHANGES***********************************/ 
    for(int i=0; i < modifiedList.Count;i++) 
    { 
     if (modModdedRNs.Contains(modifiedList[i].RecordNumber)) 
     { 
      if (modifiedAttrs < 1001) 
      { 
       //database modifications are in here        
       ignoredRecNo.Add(modifiedMatchList[i].RecordNumber); 
       ignoredDBstrings.Add(modifiedMatchList[i].dbString); 
       modifiedAttrs++; 
       modifiedResults += "<div class='turnBlue'>"; 
       //show where the change was made at 
       modifiedResults += "Change Detected at original FID# " + (modifiedList[i].RecordNumber - 1) + " and modified FID#"; 
       HashSet<String> mismatchedColumns = new HashSet<String>(); 
       modifiedResults += (modifiedMatchList[i].RecordNumber - 1); 
       modifiedResults += "</div>"; //end turnblue div 
       DataRow origRow = modifiedList[i].datarow; 
       DataRow modRow = modifiedMatchList[i].datarow; 
       foreach (String columnName in columnNames) 
       { 
        String origRowValue = "" + origRow.Field<Object>(columnName); 
        String modRowValue = "" + modRow.Field<Object>(columnName); 
        if (!modRowValue.Equals(origRowValue)) 
        { 
         mismatchedColumns.Add(columnName); 

        } 
       } 
       foreach (String mismatchedColumn in mismatchedColumns) 
       { 

        //grab original attr value 
        String origMismatchedRowValue = "" + origRow.Field<Object>(mismatchedColumn); 
        //grab the modified value 
        String modMismatchedRowValue = "" + modRow.Field<Object>(mismatchedColumn); 
        //generate a heading, letting the user know about the situation 
        modifiedResults += "<div class='turnBlue'>Value at Column: &nbsp;<b>" + mismatchedColumn + "</b> has been modified<br/>"; 
        modifiedResults += "<div class='pushLeft'>Original value: &nbsp;<b>" + origMismatchedRowValue + "</b><br/></div>"; 
        modifiedResults += "<div class='pushLeft'>Modified value: &nbsp;<b>" + modMismatchedRowValue + "</b><br/></div>"; 
        modifiedResults += "</div>"; //end modified div 
       } 
      } 
      else 
      { 
       modifiedAttrs++; 
      } 


     } 
     else 
     { 
      if (removed < 1001) 
      { 
       //iterate removed data here 
       removed++; 
      } 
      else 
      { 
       removed++; 
      } 

     } 
    } 
    //****************************this determines which ones have been added ***************************/ 

    foreach (PolyLineZ modpolyLineZ in modifiedNFs) 
    { 
     if (!ignoredRecNo.Contains(modpolyLineZ.RecordNumber) && (!ignoredDBstrings.Contains(modpolyLineZ.dbString))) 
     { 
      //iterate added data here 
     } 

    } 

    foreach (PolyLineZ polylineZ in removedList) 
    { 
     //iterate removed data here 
    } 
    foreach (PolyLineZ polylineZ in movedList) 
    { 
     //iterate moved data here 

    } 

    result += "<div id='addedJump'></div>" + addedResult; 
    result += "<div id='moddedJump'></div>" + modifiedResults; 
    result += "<div id='removedJump'></div>" + removedResults; 
    result += "<div id='movedJump'></div>" + movedResults; 

}

Answer

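I switched from appending all the data to a string to using StringBuilder...

That took my program from 30 minutes to process 3200 records down to an average of 7 seconds.

For more information, see this question: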

String vs. StringBuilder
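The answer does not show the changed code, so the following is only a sketch of what the switch might look like for the markup built in the question. Each `+=` on a string copies everything accumulated so far, so the total cost grows roughly quadratically with the output size, while StringBuilder appends into a growable buffer. The ReportBuilder class, the RenderChanges method, and the three-element array layout (column, original value, modified value) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class ReportBuilder
{
    // Builds the HTML fragment for a sequence of changes.
    // Each change is assumed to be { columnName, originalValue, modifiedValue }.
    public static string RenderChanges(IEnumerable<string[]> changes)
    {
        var sb = new StringBuilder();
        foreach (string[] c in changes)
        {
            sb.Append("<div class='turnBlue'>Value at Column: &nbsp;<b>")
              .Append(c[0])
              .Append("</b> has been modified<br/>");
            sb.Append("<div class='pushLeft'>Original value: &nbsp;<b>")
              .Append(c[1])
              .Append("</b><br/></div>");
            sb.Append("<div class='pushLeft'>Modified value: &nbsp;<b>")
              .Append(c[2])
              .Append("</b><br/></div>");
            sb.Append("</div>"); // end modified div
        }
        // One final copy into an immutable string.
        return sb.ToString();
    }
}
```

In the question's code the same idea would mean replacing the `result`, `modifiedResults`, and similar string accumulators with StringBuilder instances and calling ToString once at the end.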