2011-06-29 36 views
11

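Replace a sequence of bytes in a binary file

What is the best way to replace a sequence of bytes in a binary file with other bytes of the same length? The binary files will be quite large, about 50 MB, and should not be loaded into memory all at once.

Update: I don't know the positions of the bytes that need to be replaced; I need to find them first.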

+0

Open the file, move the pointer to the position of the old bytes, and write the new bytes. –

+0

How do you know the exact position of the bytes to modify? Is it a fixed offset? –

Answers

14

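Assuming you're trying to replace a known section of the file:

  • Open a FileStream with read/write access
  • Seek to the right place
  • Overwrite the existing data

Sample code coming...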

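public static void ReplaceData(string filename, int position, byte[] data)
{
    // FileMode.Open without a FileAccess argument opens the file for reading and writing
    using (Stream stream = File.Open(filename, FileMode.Open))
    {
        // Jump to the known offset and overwrite the bytes in place
        stream.Position = position;
        stream.Write(data, 0, data.Length);
    }
}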

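If you're effectively trying to do a binary version of string.Replace (e.g. "always replace bytes { 51, 20, 34 } with { 20, 35, 15 }") then it's rather harder. As a quick description of what you'd do:

  • Allocate a buffer of at least the size of the data you're interested in
  • Repeatedly read into the buffer, scanning for the data
  • If you find a match, seek back to the right place (e.g. stream.Position -= buffer.Length - indexWithinBuffer;) and overwrite the data

Sounds simple so far, but the tricky bit is when the data you're looking for starts near the end of the buffer. You need to remember all potential matches and how far you've matched so far, so that if you complete a match while reading the next buffer's worth of data, you can detect it.

There are probably ways of avoiding this trickiness, but I wouldn't like to try to come up with them offhand :)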
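One possible way to "remember how far you've matched so far" across buffer boundaries is a Knuth-Morris-Pratt style search, which is not something the answer itself proposes. The sketch below is only an illustration under that assumption; `StreamSearch`, `IndexOf` and `bufferSize` are hypothetical names, not part of any library. The match state is a single integer that survives from one read to the next, so the stream never has to be rewound while searching.

```csharp
using System;
using System.IO;

public static class StreamSearch
{
    // failure[i] = length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of pattern[0..i].
    private static int[] BuildFailureTable(byte[] pattern)
    {
        var failure = new int[pattern.Length];
        int k = 0;
        for (int i = 1; i < pattern.Length; i++)
        {
            while (k > 0 && pattern[i] != pattern[k]) k = failure[k - 1];
            if (pattern[i] == pattern[k]) k++;
            failure[i] = k;
        }
        return failure;
    }

    // Returns the offset of the first occurrence of pattern, counted from the
    // stream position at the time of the call, or -1 if it is not found.
    public static long IndexOf(Stream stream, byte[] pattern, int bufferSize = 64 * 1024)
    {
        if (pattern == null || pattern.Length == 0)
            throw new ArgumentException("pattern must not be empty");

        int[] failure = BuildFailureTable(pattern);
        var buffer = new byte[bufferSize];
        int matched = 0;   // pattern bytes matched so far; carried across reads
        long consumed = 0; // bytes scanned in earlier reads
        int read;

        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                // Fall back to the longest shorter prefix that still matches
                while (matched > 0 && buffer[i] != pattern[matched])
                    matched = failure[matched - 1];
                if (buffer[i] == pattern[matched]) matched++;
                if (matched == pattern.Length)
                    return consumed + i + 1 - pattern.Length;
            }
            consumed += read;
        }
        return -1;
    }
}
```

Once the offset is known, the bytes can be overwritten by setting the stream's Position to it and writing the replacement, as in ReplaceData above.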

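EDIT: Okay, I've got an idea which may help...

  • Keep a buffer which is at least twice as big as you need
  • Repeatedly:
    • Copy the second half of the buffer to the first half
    • Fill the second half of the buffer from the file
    • Search the whole buffer for the data you're looking for

That way, at some point, if the data is present, it will be completely within the buffer.

You'd need to be careful about where the stream is in order to get back to the right place, but I think this should work. It would be trickier if you were trying to find all matches, but at least the first match should be reasonably simple...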
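A minimal sketch of how the two-halves idea might look for the first match only, assuming the replacement has the same length as the search sequence. `InPlaceReplace`, `ReplaceFirst` and `halfSize` are hypothetical names chosen for illustration. The file offset of buffer[0] is tracked separately, so the position to seek back to is simply that offset plus the index within the buffer.

```csharp
using System;
using System.IO;

public static class InPlaceReplace
{
    // Overwrites the first occurrence of search with replacement (same length).
    // Returns the file offset of the match, or -1 if nothing was found.
    public static long ReplaceFirst(string path, byte[] search, byte[] replacement,
                                    int halfSize = 64 * 1024)
    {
        if (search.Length == 0 || search.Length != replacement.Length)
            throw new ArgumentException("search and replacement must be non-empty and of equal length");
        int half = Math.Max(halfSize, search.Length); // each half must hold a whole pattern

        using (var stream = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        {
            var buffer = new byte[half * 2];
            int carried = 0;      // bytes kept from the previous round, at buffer[0..carried)
            long bufferStart = 0; // file offset corresponding to buffer[0]

            while (true)
            {
                // Fill the part after the carried bytes from the file
                int read = ReadFully(stream, buffer, carried, half);
                int valid = carried + read;

                int index = IndexOf(buffer, valid, search);
                if (index >= 0)
                {
                    long matchOffset = bufferStart + index;
                    stream.Position = matchOffset; // seek back to the match
                    stream.Write(replacement, 0, replacement.Length);
                    return matchOffset;
                }
                if (read < half) return -1; // end of file reached

                // Move the newest half to the front so a match that straddles
                // two reads is fully inside the buffer on the next round
                int keep = Math.Min(valid, half);
                Buffer.BlockCopy(buffer, valid - keep, buffer, 0, keep);
                bufferStart += valid - keep;
                carried = keep;
            }
        }
    }

    // Stream.Read may return fewer bytes than requested, so loop until done or EOF
    private static int ReadFully(Stream stream, byte[] buffer, int offset, int count)
    {
        int total = 0;
        while (total < count)
        {
            int n = stream.Read(buffer, offset + total, count - total);
            if (n == 0) break;
            total += n;
        }
        return total;
    }

    // Naive search within the first 'valid' bytes of the buffer
    private static int IndexOf(byte[] buffer, int valid, byte[] pattern)
    {
        for (int i = 0; i + pattern.Length <= valid; i++)
        {
            int j = 0;
            while (j < pattern.Length && buffer[i + j] == pattern[j]) j++;
            if (j == pattern.Length) return i;
        }
        return -1;
    }
}
```

The first half is scanned twice in this version, which wastes a little work but keeps the bookkeeping simple.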

+0

I forgot to point out in my original question that I don't know the position of the bytes to replace. – Tomas

+0

@Tomas: Right, so is the last part of my answer a reasonable description of what you're trying to do? –

+0

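@Jon, sorry that I forgot to specify. We need to find the first occurrence of the byte sequence in the file and then replace it. I know this is hard to do, and I haven't had any luck googling, so I posted the question here. – Tomas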

5

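My solution:

    /// <summary>
    /// Copy data from one file to another, replacing the search term, ignoring case.
    /// </summary>
    /// <param name="originalFile">Path of the file to read.</param>
    /// <param name="outputFile">Path of the file to write.</param>
    /// <param name="searchTerm">Text to look for (matched as UTF-8 bytes, upper or lower case).</param>
    /// <param name="replaceTerm">Text written in place of each match.</param>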
    private static void ReplaceTextInBinaryFile(string originalFile, string outputFile, string searchTerm, string replaceTerm) 
    { 
     byte b; 
     //UpperCase bytes to search 
     byte[] searchBytes = Encoding.UTF8.GetBytes(searchTerm.ToUpper()); 
     //LowerCase bytes to search 
     byte[] searchBytesLower = Encoding.UTF8.GetBytes(searchTerm.ToLower()); 
     //Temporary bytes during found loop 
     byte[] bytesToAdd = new byte[searchBytes.Length]; 
     //Search length 
     int searchBytesLength = searchBytes.Length; 
     //First Upper char 
     byte searchByte0 = searchBytes[0]; 
     //First Lower char 
     byte searchByte0Lower = searchBytesLower[0]; 
     //Replace with bytes 
     byte[] replaceBytes = Encoding.UTF8.GetBytes(replaceTerm); 
     int counter = 0; 
     using (FileStream inputStream = File.OpenRead(originalFile)) { 
      //input length 
      long srcLength = inputStream.Length; 
      using (BinaryReader inputReader = new BinaryReader(inputStream)) { 
       using (FileStream outputStream = File.Create(outputFile)) { //Create truncates an existing file; OpenWrite would leave stale bytes at the end
        using (BinaryWriter outputWriter = new BinaryWriter(outputStream)) { 
         for (long nSrc = 0; nSrc < srcLength; ++nSrc) {
          b = inputReader.ReadByte();
          //not the first byte of the search term: copy it through
          if (b != searchByte0 && b != searchByte0Lower) {
           outputWriter.Write(b);
           continue;
          }
          //first byte matched: read on, saving each byte in case the match fails
          bytesToAdd[0] = b;
          int nRead = 1;
          bool found = true;
          while (nRead < searchBytesLength) {
           //stop at end of file instead of letting ReadByte throw
           if (nSrc + nRead >= srcLength) {
            found = false;
            break;
           }
           b = bytesToAdd[nRead] = inputReader.ReadByte();
           ++nRead;
           if (b != searchBytes[nRead - 1] && b != searchBytesLower[nRead - 1]) {
            found = false;
            break;
           }
          }
          if (found) {
           //success
           ++counter;
           outputWriter.Write(replaceBytes);
          }
          else {
           //failed, copy the saved bytes through unchanged
           //note: these bytes are not re-scanned, so a match starting inside them is missed
           outputWriter.Write(bytesToAdd, 0, nRead);
          }
          //account for the extra bytes consumed beyond the first
          nSrc += nRead - 1;
         }
        } 
       } 
      } 
     } 
     Console.WriteLine("ReplaceTextInBinaryFile.counter = " + counter); 
    } 
+1

Doesn't really work. –

+0

Worked fine for me. – quilkin

3

You can use my BinaryUtility to search and replace one or more byte sequences without loading the entire file into memory, like this:

var searchAndReplace = new List<Tuple<byte[], byte[]>>() 
{ 
    Tuple.Create(
     BitConverter.GetBytes((UInt32)0xDEADBEEF), 
     BitConverter.GetBytes((UInt32)0x)), 
    Tuple.Create(
     BitConverter.GetBytes((UInt32)0xAABBCCDD), 
     BitConverter.GetBytes((UInt16)0xAFFE)), 
}; 
using(var reader = 
    new BinaryReader(new FileStream(@"C:\temp\data.bin", FileMode.Open))) 
{ 
    using(var writer = 
     new BinaryWriter(new FileStream(@"C:\temp\result.bin", FileMode.Create))) 
    { 
     BinaryUtility.Replace(reader, writer, searchAndReplace); 
    } 
} 

BinaryUtility code:

using System; 
using System.Collections.Generic; 
using System.IO; 
using System.Linq; 

public static class BinaryUtility 
{ 
    public static IEnumerable<byte> GetByteStream(BinaryReader reader) 
    { 
     const int bufferSize = 1024; 
     byte[] buffer; 
     do 
     { 
      buffer = reader.ReadBytes(bufferSize); 
      foreach (var d in buffer) { yield return d; } 
     } while (bufferSize == buffer.Length); 
    } 

    public static void Replace(BinaryReader reader, BinaryWriter writer, IEnumerable<Tuple<byte[], byte[]>> searchAndReplace) 
    { 
     foreach (byte d in Replace(GetByteStream(reader), searchAndReplace)) { writer.Write(d); } 
    } 

    public static IEnumerable<byte> Replace(IEnumerable<byte> source, IEnumerable<Tuple<byte[], byte[]>> searchAndReplace) 
    { 
     foreach (var s in searchAndReplace) 
     { 
      source = Replace(source, s.Item1, s.Item2); 
     } 
     return source; 
    } 

    public static IEnumerable<byte> Replace(IEnumerable<byte> input, IEnumerable<byte> from, IEnumerable<byte> to) 
    { 
     var fromEnumerator = from.GetEnumerator(); 
     fromEnumerator.MoveNext(); 
     int match = 0; 
     foreach (var data in input) 
     { 
      if (data == fromEnumerator.Current) 
      { 
       match++; 
       if (fromEnumerator.MoveNext()) { continue; } 
       foreach (byte d in to) { yield return d; } 
       match = 0; 
       fromEnumerator.Reset(); 
       fromEnumerator.MoveNext(); 
       continue; 
      } 
      if (0 != match) 
      { 
       foreach (byte d in from.Take(match)) { yield return d; } 
       match = 0; 
       fromEnumerator.Reset(); 
       fromEnumerator.MoveNext(); 
      } 
      yield return data; 
     } 
     if (0 != match) 
     { 
      foreach (byte d in from.Take(match)) { yield return d; } 
     } 
    } 
} 
0
public static void BinaryReplace(string sourceFile, byte[] sourceSeq, string targetFile, byte[] targetSeq) 
    { 
     FileStream sourceStream = File.OpenRead(sourceFile); 
     FileStream targetStream = File.Create(targetFile); 

     try 
     { 
      int b; 
      long foundSeqOffset = -1; 
      int searchByteCursor = 0; 

      while ((b=sourceStream.ReadByte()) != -1) 
      { 
       if (sourceSeq[searchByteCursor] == b) 
       { 
        if (searchByteCursor == sourceSeq.Length - 1) 
        { 
         targetStream.Write(targetSeq, 0, targetSeq.Length); 
         searchByteCursor = 0; 
         foundSeqOffset = -1; 
        } 
        else 
        { 
         if (searchByteCursor == 0) 
         { 
          foundSeqOffset = sourceStream.Position - 1; 
         } 

         ++searchByteCursor; 
        } 
       } 
       else 
       { 
        if (searchByteCursor == 0) 
        { 
         targetStream.WriteByte((byte) b); 
        } 
        else 
        { 
         targetStream.WriteByte(sourceSeq[0]); 
         sourceStream.Position = foundSeqOffset + 1; 
         searchByteCursor = 0; 
         foundSeqOffset = -1; 
        } 
       } 
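      }

      // A partial match still pending at end of file was never written; flush it
      if (searchByteCursor > 0)
      {
       targetStream.Write(sourceSeq, 0, searchByteCursor);
      }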
     } 
     finally 
     { 
      sourceStream.Dispose(); 
      targetStream.Dispose(); 
     } 
    }