Uncompress a GZIPed HTTP response in Java

I'm trying to uncompress a GZIPed HTTP response by using GZIPInputStream. However, I always get the same exception when I try to read the stream: java.util.zip.ZipException: invalid bit length repeat

My HTTP request headers:

GET www.myurl.com HTTP/1.0\r\n 
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; fr; rv:1.9.2) Gecko/20100115 Firefox/3.6\r\n 
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n 
Accept-Language: fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3\r\n 
Accept-Encoding: gzip,deflate\r\n 
Accept-Charset: ISO-8859-1,UTF-8;q=0.7,*;q=0.7\r\n 
Keep-Alive: 115\r\n 
Connection: keep-alive\r\n 
X-Requested-With: XMLHttpRequest\r\n 
Cookie: Some Cookies\r\n\r\n

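At the end of the HTTP response headers I get path=/Content-Encoding: gzip, followed by the gzipped response.

I tried 2 similar pieces of code to decompress it:

UPDATE: in the following code, tBytes = (the string after 'path=/Content-Encoding: gzip').getBytes();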

GZIPInputStream gzip = new GZIPInputStream (new ByteArrayInputStream (tBytes)); 

StringBuffer szBuffer = new StringBuffer(); 

byte tByte [] = new byte [1024]; 

while (true) 
{ 
    int iLength = gzip.read (tByte, 0, 1024); // <-- Error comes here 

    if (iLength < 0) 
     break; 

    szBuffer.append (new String (tByte, 0, iLength)); 
}

And this one, which I got from this forum:

InputStream  gzipStream = new GZIPInputStream (new ByteArrayInputStream (tBytes)); 
Reader   decoder = new InputStreamReader (gzipStream, "UTF-8");//<- I tried ISO-8859-1 and get the same exception 
BufferedReader buffered = new BufferedReader (decoder);

I guess it's an encoding error.

Best regards,

bill0ute

Source

2010-03-19 bill0ute

You don't show how you obtain the tBytes that you use to set up the gzip stream here:

GZIPInputStream gzip = new GZIPInputStream (new ByteArrayInputStream (tBytes));

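One explanation is that you are including the entire HTTP response in tBytes. Instead, it should contain only the content that follows the HTTP headers.

Another explanation is that the response is chunked.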
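If the response does turn out to use Transfer-Encoding: chunked, the body has to be de-chunked before it reaches GZIPInputStream. In HTTP/1.1 each chunk is a hexadecimal size line, then that many bytes of data, then a CRLF, and a chunk of size 0 ends the body. The following is only a minimal sketch of that idea; the class name ChunkedBodyDecoder and its methods are made up for illustration, and it assumes the whole chunked body is already in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of any library.
final class ChunkedBodyDecoder {

    /** Reads one line up to LF and returns it without CR/LF. */
    private static String readLine(InputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                break;
            }
            if (b != '\r') {
                line.append((char) b);
            }
        }
        return line.toString();
    }

    /** Joins the data of all chunks; trailer headers after the last chunk are ignored. */
    static byte[] decode(byte[] chunkedBody) throws IOException {
        InputStream in = new ByteArrayInputStream(chunkedBody);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            int semicolon = sizeLine.indexOf(';'); // drop optional chunk extensions
            String hex = (semicolon >= 0 ? sizeLine.substring(0, semicolon) : sizeLine).trim();
            if (hex.isEmpty()) {
                throw new IOException("missing chunk size");
            }
            int size = Integer.parseInt(hex, 16);
            if (size < 0) {
                throw new IOException("negative chunk size");
            }
            if (size == 0) {
                break; // the zero-size chunk marks the end of the body
            }
            byte[] chunk = new byte[size];
            int read = 0;
            while (read < size) {
                int n = in.read(chunk, read, size - read);
                if (n < 0) {
                    throw new IOException("truncated chunk");
                }
                read += n;
            }
            out.write(chunk, 0, size);
            readLine(in); // consume the CRLF that follows the chunk data
        }
        return out.toByteArray();
    }
}
```

The result of decode(...) would then be what gets wrapped in a ByteArrayInputStream and handed to GZIPInputStream.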

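Edit: You are taking the data after the Content-Encoding line as the message body. However, according to the HTTP 1.1 specification, the header fields do not come in any particular order, so this is very dangerous.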

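As this part of the HTTP specification explains, the message body of a request or response does not come after a particular header field but after the first empty line:

Request (section 5) and Response (section 6) messages use the generic message format of RFC 822 [9] for transferring entities (the payload of the message). Both types of message consist of a start-line, zero or more header fields (also known as "headers"), an empty line (i.e., a line with nothing preceding the CRLF) indicating the end of the header fields, and possibly a message-body.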

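You still haven't shown exactly how you compose tBytes, but at this point I think you are mistakenly including the empty line in the data that you try to decompress. The message body starts right after the CRLF characters of the empty line.

May I suggest that you use the httpclient library to extract the message body?

Source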
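To illustrate the point about the empty line, here is a rough sketch that locates the first CRLF CRLF in the raw response and decompresses only what follows it. The class and method names (RawHttpGzipBody, bodyOffset, gunzipBody) are hypothetical. It assumes the array holds exactly one complete, non-chunked response, and that the response was read from the socket as bytes and never converted to a String, since a round trip through String and getBytes() can alter binary data.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not part of any library.
final class RawHttpGzipBody {

    /** Returns the index of the first body byte, or -1 if no empty line was found. */
    static int bodyOffset(byte[] rawResponse) {
        for (int i = 0; i + 3 < rawResponse.length; i++) {
            if (rawResponse[i] == '\r' && rawResponse[i + 1] == '\n'
                    && rawResponse[i + 2] == '\r' && rawResponse[i + 3] == '\n') {
                return i + 4; // the body starts right after the CRLF of the empty line
            }
        }
        return -1;
    }

    /** Decompresses the gzip body of a raw, non-chunked HTTP response. */
    static byte[] gunzipBody(byte[] rawResponse) throws IOException {
        int offset = bodyOffset(rawResponse);
        if (offset < 0) {
            throw new IOException("no empty line separating headers from body");
        }
        try (GZIPInputStream gzip = new GZIPInputStream(
                new ByteArrayInputStream(rawResponse, offset, rawResponse.length - offset))) {
            return gzip.readAllBytes(); // InputStream.readAllBytes() needs Java 9 or later
        }
    }
}
```

With something like this, tBytes would no longer depend on which header happens to come last: it is simply everything from bodyOffset(...) to the end of the response.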

2010-03-19 01:01:36

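Hi Wim. Thanks for your answer. I updated the message to explain how I get tBytes. I don't think the response is chunked, because there is a Content-Length header. But I'm not sure. bill0ute – bill0ute 2010-03-19 01:23:27

Hi Wim. I'm trying to use the HttpClient package, but I can't find the Javadoc. I only found examples. Could you give me a small example that connects to a socket and sends a GET request? Thanks – bill0ute 2010-03-19 11:35:01

Take a look at this tutorial, it is a simple example of getting the body of an HTTP GET response: http://hc.apache.org/httpclient-3.x/tutorial.html In your case, you would want to process 'responseBody' the same way you now process 'tBytes'. – 2010-03-19 11:57:19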

Well, here is where I can see the problem:

int iLength = gzip.read (tByte, 0, 1024);

Use the following to fix the problem:

byte[] buff = new byte[1024];
byte[] emptyBuff = new byte[1024];
StringBuffer unGzipRes = new StringBuffer();

int byteCount = 0;
while ((byteCount = gzip.read(buff, 0, 1024)) > 0) {
    // only append the buff elements that
    // contains data
    unGzipRes.append(new String(Arrays.copyOf(buff, byteCount), "utf-8"));

    // empty the buff for re-usability and
    // prevent dirty data attached at the
    // end of the buff
    System.arraycopy(emptyBuff, 0, buff, 0, 1024);
}
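One caveat with decoding each 1024-byte block separately is that a multi-byte UTF-8 character can straddle two reads and come out garbled. A possible variation, sketched here under the assumption that the gzip data fits in memory, collects the decompressed bytes first and decodes them once at the end. GzipText and gunzipToString are illustrative names, not an existing API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not part of any library.
final class GzipText {

    /** Decompresses gzip data and decodes the result once with the given charset. */
    static String gunzipToString(byte[] gzipped, Charset charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
            byte[] buffer = new byte[1024];
            int n;
            // read() reports how many bytes are valid, so the buffer never needs clearing
            while ((n = gzip.read(buffer, 0, buffer.length)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        // decode only after all bytes are available, so no character is split
        return new String(out.toByteArray(), charset);
    }
}
```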

Source

2014-01-09 11:05:15
