Raw file bytes from StreamReader, magic number detection

I'm trying to tell "text" files apart from "binary" files, because I effectively want to ignore files with "unreadable" content.

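I have a file that I believe is a GZIP archive. I'm trying to ignore this kind of file by detecting its magic number / file signature. If I open the file with the hex editor plugin in Notepad++, I can see that the first three hex codes are 1f 8b 08.

But if I read the file with a StreamReader, I don't know how to get at the raw bytes: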

using (var streamReader = new StreamReader(@"C:\file")) 
{ 
    char[] buffer = new char[10]; 
    streamReader.Read(buffer, 0, 10); 
    var s = new String(buffer); 

    byte[] bytes = new byte[6]; 
    System.Buffer.BlockCopy(s.ToCharArray(), 0, bytes, 0, 6); 
    var hex = BitConverter.ToString(bytes); 

    var otherhex = BitConverter.ToString(System.Text.Encoding.UTF8.GetBytes(s.ToCharArray())); 
}

At the end of the using block, the variables hold the following values:

hex: "1F-00-FD-FF-08-00" 
otherhex: "1F-EF-BF-BD-08-00-EF-BF-BD-EF-BF-BD-0A-51-02-03"

Neither starts with the hex values shown in Notepad++.

Is it possible to get the raw bytes from the result of reading a file through a StreamReader?

2013-02-10 Tom Hunter

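Just test the bytes in a byte array; you don't need all the string stuff. – 2013-02-10 12:49:38

The problem is that (despite the example above) I actually start from a string, which I know was produced by a StreamReader, and I would prefer not to change how that string is supplied. [This answer](http://stackoverflow.com/a/10380166/62072) seems to suggest that it is possible to get the raw bytes back from a string. What am I missing? – 2013-02-10 12:57:23

The 1F you see in your hex viewer is the byte value 31 rendered as two characters: '1' (49, 0x31) and 'F' (70, 0x46). Char(0x1F) is the ASCII US character (unit separator), which is non-printable like ESC or BEL. So if you are looking for the bytes after they have effectively been converted to chars, you have to look for Char(0x1F) Char(0x8B) Char(0x08). – 2013-02-10 14:28:27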

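Your code tries to turn a binary buffer into a string. Strings in .NET are Unicode (UTF-16), so each char takes two bytes. As you can see, the result is somewhat unpredictable.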
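To make that result less mysterious, here is a minimal sketch, assuming the file was decoded as UTF-8 (StreamReader's default when it finds no byte order mark). The class and method names are made up for illustration. A lone 0x8B byte is not valid UTF-8, so the decoder substitutes U+FFFD, and the original byte value is lost for good.

```csharp
using System;
using System.Text;

static class LossyDecodingDemo
{
    // Decodes raw bytes as UTF-8, then re-encodes the resulting string
    // with the given encoding and returns the bytes as a hex string.
    public static string DecodeThenEncode(byte[] raw, Encoding target)
    {
        string decoded = Encoding.UTF8.GetString(raw);
        return BitConverter.ToString(target.GetBytes(decoded));
    }

    static void Main()
    {
        byte[] gzipHeader = { 0x1F, 0x8B, 0x08 };

        // Same pattern as "otherhex" in the question: 0x8B became EF-BF-BD.
        Console.WriteLine(DecodeThenEncode(gzipHeader, Encoding.UTF8));    // 1F-EF-BF-BD-08

        // Same pattern as "hex" in the question: the UTF-16 code units.
        Console.WriteLine(DecodeThenEncode(gzipHeader, Encoding.Unicode)); // 1F-00-FD-FF-08-00
    }
}
```

Both outputs line up with the values reported in the question, which suggests the 0x8B was already replaced by the time the string existed.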

Just use a BinaryReader and its ReadBytes method:

using (FileStream fs = new FileStream(@"C:\file", FileMode.Open, FileAccess.Read))
{
    using (var reader = new BinaryReader(fs, new ASCIIEncoding()))
    {
        // ReadBytes returns fewer bytes than requested if the file is shorter.
        byte[] buffer = reader.ReadBytes(10);

        // 31, 139, 8 are 0x1F, 0x8B, 0x08 in decimal.
        if (buffer.Length >= 3 && buffer[0] == 31 && buffer[1] == 139 && buffer[2] == 8)
        {
            // you have a signature match....
        }
    }
}

2013-02-10 12:36:33 Steve

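You can't. StreamReader is for reading text, not binary files. Use a Stream directly to read the bytes; in your case, a FileStream.

To guess whether a file is text or binary, you can read the first 4K into a byte[] and interpret that.

By the way, you tried to coerce chars into bytes. That is invalid in principle. I suggest you familiarize yourself with what an Encoding is: it is the only way to convert between chars and bytes in a semantically correct way.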
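The answer does not say how to interpret those 4K, so the following is only a sketch of one common heuristic: treat the file as binary if the sample contains a NUL byte. The heuristic and the names `BinarySniffer` and `LooksBinary` are assumptions, not part of the answer, and UTF-16 text files would be misclassified because they contain NUL bytes.

```csharp
using System;
using System.IO;

static class BinarySniffer
{
    // Assumed heuristic: any NUL byte in the first `count` bytes means "binary".
    public static bool LooksBinary(byte[] sample, int count)
    {
        for (int i = 0; i < count; i++)
        {
            if (sample[i] == 0) return true;
        }
        return false;
    }

    // Reads up to the first 4096 bytes of the file and applies the heuristic.
    public static bool LooksBinary(string path)
    {
        byte[] buffer = new byte[4096];
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            int total = 0;
            while (total < buffer.Length)
            {
                int read = fs.Read(buffer, total, buffer.Length - total);
                if (read == 0) break; // end of file
                total += read;
            }
            return LooksBinary(buffer, total);
        }
    }
}
```

A stricter check could also count control characters or try a strict UTF-8 decode, but that goes beyond what the answer describes.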
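As a small illustration of that point, here is a sketch showing that the same character maps to different bytes depending on the Encoding you choose, which is why copying chars into a byte array with no encoding in mind has no well-defined meaning. The class and method names are made up for the example.

```csharp
using System;
using System.Text;

static class EncodingDemo
{
    // Converts text to bytes with an explicit encoding and returns them as hex.
    public static string ToHex(string text, Encoding encoding)
    {
        return BitConverter.ToString(encoding.GetBytes(text));
    }

    static void Main()
    {
        string text = "\u00E9"; // LATIN SMALL LETTER E WITH ACUTE

        Console.WriteLine(ToHex(text, Encoding.UTF8));    // C3-A9
        Console.WriteLine(ToHex(text, Encoding.Unicode)); // E9-00 (UTF-16, little endian)

        // Going back requires the same encoding that produced the bytes.
        byte[] utf8 = Encoding.UTF8.GetBytes(text);
        Console.WriteLine(Encoding.UTF8.GetString(utf8) == text); // True
    }
}
```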

2013-02-10 12:27:47 usr

Usage (PDF file):

Assert.AreEqual("25504446", GetMagicNumbers(filePath, 4));

The GetMagicNumbers method:

private static string GetMagicNumbers(string filepath, int bytesCount) 
{ 
    // https://en.wikipedia.org/wiki/List_of_file_signatures 

    byte[] buffer; 
    using (var fs = new FileStream(filepath, FileMode.Open, FileAccess.Read)) 
    using (var reader = new BinaryReader(fs)) 
     buffer = reader.ReadBytes(bytesCount); 

    var hex = BitConverter.ToString(buffer); 
    return hex.Replace("-", String.Empty).ToLower(); 
}
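If you would rather compare bytes directly instead of going through a hex string, something along these lines might work. It is only a sketch: the `SignatureCheck` class, its method names, and the two-entry signature list are assumptions for illustration. The gzip signature is the two bytes 1F 8B (the 08 that follows is the compression method), and the PDF signature is 25 50 44 46, as in the usage above.

```csharp
using System;
using System.IO;

static class SignatureCheck
{
    public static readonly byte[] Gzip = { 0x1F, 0x8B };
    public static readonly byte[] Pdf = { 0x25, 0x50, 0x44, 0x46 };

    // True if `header` begins with every byte of `signature`.
    public static bool StartsWith(byte[] header, byte[] signature)
    {
        if (header.Length < signature.Length) return false;
        for (int i = 0; i < signature.Length; i++)
        {
            if (header[i] != signature[i]) return false;
        }
        return true;
    }

    // Returns a short label for a known signature, or null if none matches.
    public static string Identify(byte[] header)
    {
        if (StartsWith(header, Gzip)) return "gzip";
        if (StartsWith(header, Pdf)) return "pdf";
        return null;
    }

    // Reads the first few bytes of a file and identifies them.
    public static string IdentifyFile(string path)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (var reader = new BinaryReader(fs))
        {
            return Identify(reader.ReadBytes(8));
        }
    }
}
```

For the file in the question, `SignatureCheck.IdentifyFile(@"C:\file")` would be expected to return "gzip" if the first bytes really are 1f 8b 08.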

2015-10-14 12:28:58
