2013-03-19

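Can't get URL content as UTF-8

I'm trying to read content from a URL, but instead of accented letters such as "è" and "à" it returns strange symbols.

This is the code I'm using: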

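import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;

public static String getPageContent(String _url) {
    InputStream is = null;
    StringBuilder text = new StringBuilder();
    try {
        URL url = new URL(_url);
        is = url.openStream();

        // This line should open the stream as UTF-8
        BufferedReader dis = new BufferedReader(new InputStreamReader(is, "UTF-8"));

        String line;
        while ((line = dis.readLine()) != null) {
            text.append(line).append("\n");
        }
    } catch (MalformedURLException mue) {
        mue.printStackTrace();
    } catch (IOException ioe) {
        ioe.printStackTrace();
    } finally {
        // is is still null if the URL was malformed or openStream() failed
        if (is != null) {
            try {
                is.close();
            } catch (IOException ioe) {
                // nothing to see here
            }
        }
    }
    return text.toString();
}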

I've seen other questions like this one, and they were all answered the same way:

Declare your inputstream as 
new InputStreamReader(is, "UTF-8") 

But I can't get it to work.

For example, if the content at my URL contains

è uno dei più 

I get

Ã¨ uno dei piÃ¹

What am I missing?

Answer

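Judging by your example, you are in fact receiving a multi-byte UTF-8 byte stream, but your text editor is reading it as ISO-8859-1. Tell your editor to read the bytes as UTF-8!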
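
To see why the symptom points at the viewer rather than at the reading code, here is a minimal sketch that reproduces the effect in isolation. It assumes the misconfigured tool decodes as ISO-8859-1; the class name MojibakeDemo and the methods misread and repair are made up for this illustration and are not part of the question's code. The accented letters are written as Unicode escapes so the result does not depend on the encoding of the source file.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Encode the text as UTF-8, then decode those bytes with the wrong
    // charset (ISO-8859-1), the way a misconfigured editor or console would.
    public static String misread(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Undo the damage: ISO-8859-1 maps every byte to exactly one char,
    // so the original bytes can be recovered and decoded as UTF-8.
    public static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u00e8 uno dei pi\u00f9"; // the sample text from the question
        String garbled = misread(original);

        // Each accented letter takes two bytes in UTF-8, so the misread
        // string has two more characters than the original.
        System.out.println(original.length()); // 13
        System.out.println(garbled.length());  // 15
        System.out.println(repair(garbled).equals(original)); // true
    }
}
```

If the string returned by getPageContent already holds the right characters, the remaining step is to make whatever displays it (editor, console, or output file) use UTF-8 as well.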