Can't get URL content as UTF-8

I'm trying to read the content of a URL, but instead of accented characters such as "è" it returns strange symbols.

This is the code I'm using:

public static String getPageContent(String _url) {
    URL url;
    InputStream is = null;
    BufferedReader dis;
    String line;
    String text = "";
    try {
        url = new URL(_url);
        is = url.openStream();

        // This line should open the stream as UTF-8
        dis = new BufferedReader(new InputStreamReader(is, "UTF-8"));

        while ((line = dis.readLine()) != null) {
            text += line + "\n";
        }
    } catch (MalformedURLException mue) {
        mue.printStackTrace();
    } catch (IOException ioe) {
        ioe.printStackTrace();
    } finally {
        try {
            // is stays null if openStream() failed, so guard the close
            if (is != null) {
                is.close();
            }
        } catch (IOException ioe) {
            // nothing to see here
        }
    }
    return text;
}

I've seen other questions like this one, and they were all answered the same way:

Declare your inputstream as 
new InputStreamReader(is, "UTF-8")

But I can't get it to work.

For example, if the content at my URL contains

è uno dei più

I get

Ã¨ uno dei piÃ¹

What am I missing?

Source

2013-03-19 BackSlash

I really don't know why this shouldn't work, but the Java 7 way is to use StandardCharsets.UTF_8, see

http://docs.oracle.com/javase/7/docs/api/java/nio/charset/StandardCharsets.html

in the InputStreamReader(InputStream in, Charset cs) constructor, see

http://docs.oracle.com/javase/7/docs/api/java/io/InputStreamReader.html
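
A minimal sketch of what this suggestion could look like, assuming Java 7 or later. The class name PageFetcher and the helper readAll are made up for illustration and are not from the question. Passing the Charset constant avoids the checked UnsupportedEncodingException that the String-named variant can throw, but the decoding is the same as with "UTF-8", so on its own this is unlikely to change the output described in the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical class name, used only for this sketch.
class PageFetcher {

    // Decodes the whole stream as UTF-8, line by line, ending each line with '\n'.
    static String readAll(InputStream in) throws IOException {
        StringBuilder text = new StringBuilder();
        // try-with-resources closes the reader (and the wrapped stream) for us
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                text.append(line).append('\n');
            }
        }
        return text.toString();
    }

    // Same role as getPageContent in the question, but lets IOException propagate.
    static String getPageContent(String address) throws IOException {
        try (InputStream in = new URL(address).openStream()) {
            return readAll(in);
        }
    }
}
```

Splitting the decoding into readAll makes it possible to check it against an in-memory stream without touching the network.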

Source

2013-03-19 18:48:38 uberwach

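Judging from your example, you are in fact receiving a multi-byte UTF-8 byte stream, but your text editor is reading it as ISO-8859-1. Tell your editor to read the bytes as UTF-8!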
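
The effect described here can be reproduced in a few lines. This is only an illustrative sketch: the class MojibakeDemo and its methods misread and repair are invented names. It encodes a string as UTF-8 and then decodes those same bytes as ISO-8859-1, which is what a viewer configured for the wrong charset would do.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical class name, used only for this sketch.
class MojibakeDemo {

    // Encodes the text as UTF-8, then (wrongly) decodes those bytes as ISO-8859-1.
    static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverses the mistake: recovers the raw bytes and decodes them as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }
}
```

Calling MojibakeDemo.misread on the sample text from the question yields the same garbled string the asker reports, because "è" is the two UTF-8 bytes 0xC3 0xA8, which ISO-8859-1 shows as "Ã" and "¨". If the Java string compares equal to the expected text but looks wrong on screen or in a file, the viewer's encoding, not the reading code, is the likely culprit.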

Source

2013-03-19 18:58:36
