Slow download with HttpURLConnection

I'm trying to write a method that downloads a web page. First, I create an HttpURLConnection. Second, I call the connect() method. Third, I read the data through a BufferedReader.

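The problem is that some pages are read in a reasonable time, while others are very slow (it can take about 10 minutes!). The slow pages are always the same ones, and they all come from the same website. Opening those pages in a browser takes a few seconds, not 10 minutes. Here is the code: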

static private String getWebPage(PageNode pagenode) 
{ 
    String result; 
    String inputLine; 
    URI url; 
    int cicliLettura=0; 
    long startTime=0, endTime, openConnTime=0,connTime=0, readTime=0; 
    try 
    { 
     if(Core.logGetWebPage()) 
      startTime=System.nanoTime(); 
     result=""; 
     url=pagenode.getUri(); 
     if(Core.logGetWebPage()) 
      openConnTime=System.nanoTime(); 
     HttpURLConnection yc = (HttpURLConnection) url.toURL().openConnection(); 
     if(url.toURL().getProtocol().equalsIgnoreCase("https")) 
      yc=(HttpsURLConnection)yc; 
     yc.addRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-GB;  rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13 (.NET CLR 3.5.30729)"); 
     yc.connect(); 
     if(Core.logGetWebPage()) 
      connTime=System.nanoTime(); 
     BufferedReader in = new BufferedReader(new InputStreamReader(yc.getInputStream())); 

     while ((inputLine = in.readLine()) != null) 
     { 
      result=result+inputLine+"\n"; 
      cicliLettura++; 
     } 
     if(Core.logGetWebPage()) 
      readTime=System.nanoTime(); 
     in.close(); 
     yc.disconnect(); 
     if(Core.logGetWebPage()) 
     { 
      endTime=System.nanoTime(); 
      System.out.println(/*result+*/"getWebPage eseguito in "+(endTime-startTime)/1000000+" ms. Size: "+result.length()+" Response Code="+yc.getResponseCode()+" Protocollo="+url.toURL().getProtocol()+" openConnTime: "+(openConnTime-startTime)/1000000+" connTime:"+(connTime-openConnTime)/1000000+" readTime:"+(readTime-connTime)/1000000+" cicliLettura="+cicliLettura); 
     } 
     return result; 
    }catch(IOException e){ 
     System.out.println("Eccezione: "+e.toString()); 
     e.printStackTrace(); 
     return null; 
    } 
}

Here are two log samples (all times are in milliseconds). A "normal" web page looks like this:

getWebPage Size: 48261 Response Code=200 Protocollo=http openConnTime: 0 connTime: 1 readTime: 569 cicliLettura=359

A "slow" web page, http://ricette.giallozafferano.it/Pan-di-spagna-al-cacao.html/allcomments, looks like this:

getWebPage Size: 1748261 Response Code=200 Protocollo=http openConnTime: 0 connTime: 1 readTime: 596834 cicliLettura=35685


2014-05-14 mark

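What you are probably seeing here is a consequence of the way you build up result. Remember that a String in Java is immutable, so every string concatenation has to instantiate a new String, which usually means copying all the data the old String contained. You have the following code executed for every line: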

result=result+inputLine+"\n";

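Behind the scenes, this line involves:

1. A new StringBuffer is created with the entire contents of result so far.
2. inputLine is appended to the StringBuffer.
3. The StringBuffer is converted to a String.
4. A new StringBuffer is created with that String.
5. A newline character is appended to the StringBuffer.
6. The StringBuffer is converted to a String.
7. That String is stored as result.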

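The exact steps depend on the compiler (javac has used StringBuilder rather than StringBuffer since Java 5, and normally one builder for the whole expression), but either way the full contents of result are copied on every iteration. The operation therefore becomes more and more time-consuming as result grows, and your results seem to show (albeit from a sample of 2!) that the read time increases drastically with the page size.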
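
To make the cost concrete, here is a minimal sketch (not part of the original answer) that counts how many characters end up being copied under each approach. The class and method names are hypothetical, and the model assumes that each concatenation copies the whole accumulated string exactly once; the line length of 48 is only an approximation derived from the slow page's size and line count.

```java
final class ConcatCost {
    // Characters copied when `lines` lines of `lineLen` chars (plus '\n') are added
    // with result = result + line + "\n", assuming one full copy per iteration.
    static long copiedByConcat(int lines, int lineLen) {
        long copied = 0;
        long size = 0;
        for (int i = 0; i < lines; i++) {
            size += lineLen + 1; // length of result after this iteration
            copied += size;      // the new String holds a copy of everything so far
        }
        return copied;
    }

    // Characters written when appending to a single builder
    // (buffer regrowth is ignored here; it is amortized).
    static long copiedByAppend(int lines, int lineLen) {
        return (long) lines * (lineLen + 1);
    }

    public static void main(String[] args) {
        // Roughly the slow page from the question: 35685 lines, about 1.7 MB.
        int lines = 35685;
        int lineLen = 48; // assumed average line length
        System.out.println("concat: " + copiedByConcat(lines, lineLen));
        System.out.println("append: " + copiedByAppend(lines, lineLen));
    }
}
```

Under this simplified model the concatenation loop grows quadratically with the number of lines: tens of billions of character copies for the slow page, against fewer than two million for a single builder. Garbage collection of the discarded intermediate strings adds further overhead that the model does not capture.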

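Instead, use a StringBuffer directly (or a StringBuilder, its unsynchronized equivalent, if only one thread touches it).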

StringBuffer buffer = new StringBuffer(); 
while ((inputLine = in.readLine()) != null) 
{ 
    buffer.append(inputLine).append('\n'); 
    cicliLettura++; 
} 
String result = buffer.toString();
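
A fuller sketch of the same read loop, under a few assumptions, might look like the following. It swaps in StringBuilder for StringBuffer, closes the reader with try-with-resources, and sets timeouts on the connection. The names PageReader, readAll and fetch are hypothetical, the timeout values are arbitrary, and UTF-8 is assumed rather than read from the response headers.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

final class PageReader {
    // Reads every line from the source into one builder; each append is amortized O(1).
    static String readAll(Reader source) throws IOException {
        StringBuilder buffer = new StringBuilder();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                buffer.append(line).append('\n');
            }
        }
        return buffer.toString();
    }

    // Downloads a page over HTTP or HTTPS. Timeouts are assumed values, in milliseconds.
    static String fetch(URL url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(10_000);
        conn.setReadTimeout(30_000);
        try {
            // Assumption: the page is UTF-8 encoded.
            return readAll(new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8));
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Exercise the read loop without touching the network.
        System.out.print(readAll(new StringReader("first line\nsecond line")));
    }
}
```

Keeping the read loop separate from the connection code makes it possible to test the loop against an in-memory Reader, as main does here.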


2014-05-14 15:52:51

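Just great, a very professional answer. Could you point me to some documentation that explains this? Here is the new output for that page: getWebPage Size: 1709466 Response Code=200 Protocollo=http openConnTime: 0 connTime: 0 readTime: 2257 cicliLettura=35686 – mark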
