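GZipStream: not all compressed data is written, even with Flush?

I have a nasty problem with GZipStream targeting .NET 3.5. This is my first time working with GZipStream; I have modeled my code on a number of tutorials (including here), and I am still stuck.

My application serializes a DataTable to XML and inserts it into the database, storing the compressed data in a varbinary(max) field along with the original length of the uncompressed buffer. Later, when I need it, I retrieve the data, decompress it, and recreate the DataTable. The decompression is what seems to fail.

EDIT: Sadly, after changing GetBuffer to ToArray, my problem remains. The code is updated below.

Compression code: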

DataTable dt = new DataTable("MyUnit"); 
//do stuff with dt 
//okay... now compress the table 
using (MemoryStream xmlstream = new MemoryStream()) 
{ 
    //instead of stream, use xmlwriter? 
    System.Xml.XmlWriterSettings settings = new System.Xml.XmlWriterSettings(); 
    settings.Encoding = Encoding.GetEncoding(1252); 
    settings.Indent = false; 
    System.Xml.XmlWriter writer = System.Xml.XmlWriter.Create(xmlstream, settings); 
    try 
    { 
     dt.WriteXml(writer); 
     writer.Flush(); 
    } 
    catch (ArgumentException) 
    { 
     //likely an encoding issue... okay, base64 encode it 
     var base64 = Convert.ToBase64String(xmlstream.ToArray()); 
     xmlstream.Write(Encoding.GetEncoding(1252).GetBytes(base64), 0, Encoding.GetEncoding(1252).GetBytes(base64).Length); 
    } 

    using (MemoryStream zipstream = new MemoryStream()) 
    { 
     GZipStream zip = new GZipStream(zipstream, CompressionMode.Compress); 
     log.DebugFormat("Compressing commands..."); 
     zip.Write(xmlstream.GetBuffer(), 0, xmlstream.ToArray().Length); 
     zip.Flush(); 
     float ratio = (float)zipstream.ToArray().Length/(float)xmlstream.ToArray().Length; 
     log.InfoFormat("Resulting compressed size is {0:P2} of original", ratio); 

     using (SqlCommand cmd = new SqlCommand()) 
     { 
      cmd.CommandText = "INSERT INTO tinydup (lastid, command, compressedlength) VALUES (@lastid,@compressed,@length)"; 
      cmd.Connection = db; 
      cmd.Parameters.Add("@lastid", SqlDbType.Int).Value = lastid; 
      cmd.Parameters.Add("@compressed", SqlDbType.VarBinary).Value = zipstream.ToArray(); 
      cmd.Parameters.Add("@length", SqlDbType.Int).Value = xmlstream.ToArray().Length; 
      cmd.ExecuteNonQuery(); 

     } 
    }

Decompression code:

/* This is an encapsulation of what I get from the database 
public class DupUnit{ 
    public uint lastid; 
    public uint complength; 
    public byte[] compressed; 
}*/ 
    //I have already retrieved my list of work to do from the database in a List<Dupunit> dupunits 
foreach (DupUnit unit in dupunits) 
{ 
    DataSet ds = new DataSet(); 
    //DataTable dt = new DataTable(); 
    //uncompress and extract to original datatable 
    try 
    { 
     using (MemoryStream zipstream = new MemoryStream(unit.compressed)) 
     { 
      GZipStream zip = new GZipStream(zipstream, CompressionMode.Decompress); 
      byte[] xmlbits = new byte[unit.complength]; 
      //WHY ARE YOU ALWAYS 0!!!!!!!! 
      int bytesdecompressed = zip.Read(xmlbits, 0, unit.compressed.Length); 
      MemoryStream xmlstream = new MemoryStream(xmlbits); 
      log.DebugFormat("Uncompressed XML against {0} is: {1}", m_source.DSN, Encoding.GetEncoding(1252).GetString(xmlstream.ToArray())); 
      try{ 
       ds.ReadXml(xmlstream); 
      }catch(Exception) 
      { 
       //it may have been base64 encoded... decode first. 
       ds.ReadXml(Encoding.GetEncoding(1254).GetString(
       Convert.FromBase64String(
       Encoding.GetEncoding(1254).GetString(xmlstream.ToArray()))) 
       ); 
      } 
      xmlstream.Dispose(); 
     } 
    } 
    catch (Exception e) 
    { 
     log.Error(e); 
     Thread.Sleep(1000);//sleep a sec! 
     continue; 
    }

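Note bytesdecompressed above: as the comment says, it is always 0. Any ideas? Am I doing something wrong?

EDIT 2:

So, this is weird. I added the following debug code to the decompression routine: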

GZipStream zip = new GZipStream(zipstream, CompressionMode.Decompress); 
    byte[] xmlbits = new byte[unit.complength]; 
    int offset = 0; 
    while (zip.CanRead && offset < xmlbits.Length) 
    { 
     while (zip.Read(xmlbits, offset, 1) == 0) ; 
     offset++; 
    }

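When debugging, the loop would sometimes complete, but other times it would hang. When I paused the debugger, it was at byte 1600 of 1616. I would continue, but it would not move at all.

EDIT 3: The bug appears to be in the compression code. For whatever reason, it is not saving all of the data. When I try to decompress the data with a third-party gzip tool, I only get part of the original data.

I would start a bounty, but I really don't have much reputation to give right now :-(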


2014-07-01 longofest

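Finally found the answer. The compressed data was incomplete because GZipStream.Flush() does absolutely nothing to ensure that all of the data is out of the buffer; you need to call GZipStream.Close(), as pointed out here. Of course, if the compression goes wrong, everything downstream fails too: when you try to decompress the data, Read() will always return 0.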
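A minimal sketch of the fix described above, assuming .NET 3.5 (so no Stream.CopyTo). The class and method names (GzipRoundTrip, Compress, Decompress) are made up for illustration and are not from the question's code. The GZipStream is closed by its using block before the MemoryStream is read, and the decompression loop treats a return value of 0 from Read() as end of stream rather than retrying.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GzipRoundTrip
{
    // Compress a buffer. The output is read only after the GZipStream is closed.
    public static byte[] Compress(byte[] raw)
    {
        using (MemoryStream output = new MemoryStream())
        {
            // leaveOpen = true keeps 'output' open after the GZipStream is disposed.
            using (GZipStream zip = new GZipStream(output, CompressionMode.Compress, true))
            {
                zip.Write(raw, 0, raw.Length);
            } // Dispose/Close writes the remaining compressed data and the gzip trailer.

            return output.ToArray();
        }
    }

    // Decompress into a buffer of the known original length.
    public static byte[] Decompress(byte[] compressed, int originalLength)
    {
        byte[] result = new byte[originalLength];
        using (MemoryStream input = new MemoryStream(compressed))
        using (GZipStream zip = new GZipStream(input, CompressionMode.Decompress))
        {
            int offset = 0;
            while (offset < result.Length)
            {
                int read = zip.Read(result, offset, result.Length - offset);
                if (read == 0)
                {
                    break; // 0 means end of stream; looping on it would spin forever
                }
                offset += read;
            }

            if (offset != result.Length)
            {
                throw new InvalidDataException("Compressed data ended before the expected length.");
            }
        }
        return result;
    }

    public static void Main()
    {
        byte[] raw = Encoding.UTF8.GetBytes("<MyUnit><row id=\"1\"/></MyUnit>");
        byte[] packed = Compress(raw);
        byte[] unpacked = Decompress(packed, raw.Length);
        Console.WriteLine(Encoding.UTF8.GetString(unpacked));
    }
}
```

In the question's code, the equivalent change would be to close (or dispose) zip before calling zipstream.ToArray(), and to pass the length of xmlbits, not unit.compressed.Length, as the count argument to Read().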


2014-07-07 20:51:27 longofest

I would say this line, at the very least, is wrong:

cmd.Parameters.Add("@compressed", SqlDbType.VarBinary).Value = zipstream.GetBuffer();

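From the documentation for MemoryStream.GetBuffer:

Note that the buffer contains allocated bytes which might be unused. For example, if the string "test" is written into the MemoryStream object, the length of the buffer returned from GetBuffer is 256, not 4, with 252 bytes unused. To obtain only the data in the buffer, use the ToArray method.

It should be noted that the zip format works by first locating data stored at the end of the file, so if you have stored more data than was required, the required entries will not be found at the "end" of the file.

As an aside, I would also suggest a different name for your compressedlength column. I initially took it (despite your narrative) as being intended to store, well, the length of the compressed data, and wrote part of my answer to address that. Maybe originalLength would be a better name?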
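As a small illustration of the difference described in the quoted documentation, the sketch below writes four bytes to a MemoryStream and prints the lengths reported by GetBuffer and ToArray. The class name BufferVsArray is made up for this example, and the exact capacity of the backing array is an implementation detail, so the code prints it instead of assuming a value.

```csharp
using System;
using System.IO;

public static class BufferVsArray
{
    public static void Main()
    {
        using (MemoryStream ms = new MemoryStream())
        {
            byte[] data = { 1, 2, 3, 4 };
            ms.Write(data, 0, data.Length);

            byte[] whole = ms.GetBuffer(); // the entire backing array, including unused capacity
            byte[] exact = ms.ToArray();   // a copy of only the bytes actually written

            Console.WriteLine("GetBuffer length: {0}", whole.Length);
            Console.WriteLine("ToArray length:   {0}", exact.Length);
            Console.WriteLine("Stream length:    {0}", ms.Length);
        }
    }
}
```

If GetBuffer is used to avoid the copy, the byte count must come from the stream's Length property, not from the length of the returned array.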


2014-07-01 14:40:17

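Great catch. I'll make the adjustment and see how it goes. – longofest

So, that was clearly a problem, but not the problem... I have updated the original question with the latest code, but I am still getting 0 from the decompression read. – longofest

...but there are a few places where I used GetBuffer rather than ToArray... hmm... let me work on it some more... – longofest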
