2012-07-17 105 views
3

Download a web page and save it as a UTF-8 text file

I download a web page as shown below and want to save it as UTF-8 text. How can I do that?

HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
{
    Encoding enc = Encoding.GetEncoding(resp.CharacterSet);
    Encoding utf8 = Encoding.UTF8;
    using (StreamWriter w = new StreamWriter(new FileStream(pathname, FileMode.Create), utf8))
    {
        using (StreamReader r = new StreamReader(resp.GetResponseStream()))
        {
            // This works, but it's bad because you read the whole response into memory:
            string s = r.ReadToEnd();
            w.Write(s);

            // This doesn't work :(
            char[] buffer = new char[1024];
            int n;
            while (!r.EndOfStream)
            {
                n = r.ReadBlock(buffer, 0, 1024);
                w.Write(utf8.GetChars(Encoding.Convert(enc, utf8, enc.GetBytes(buffer))));
            }

            // This means that r.ReadToEnd() is doing the transcoding to UTF-8 differently.
            // But how?!
        }
    }
    return resp.StatusCode;
}


+0

Take a look at this [here](http://stackoverflow.com/q/8342115/1529584) – 2012-07-17 18:03:33

+0

Uh... and what is wrong with your code? Under which conditions does it misbehave? – 2012-07-17 18:14:40

+1

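Do you *need* to play around with the encoding? Did you have a problem before you added your own encoding handling? I only ask because it is very rare that you actually need to do much with encodings; most of the time the framework handles them fine by itself. – Smudge202 2012-07-17 18:14:53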

Answers

7

You can simply use the WebClient class. It supports encodings and is easier to use:

WebClient webClient = new WebClient(); 
webClient.Encoding = System.Text.Encoding.UTF8; 
webClient.DownloadFile(url, "file.txt");
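
For comparison, here is a minimal sketch of the buffered approach the question was attempting, assuming the server reports its charset correctly. As far as I can tell, the loop in the question fails for two reasons: the StreamReader is created without `enc`, so it decodes with its default (UTF-8), and the whole 1024-char buffer is re-encoded on every pass instead of only the `n` chars just read. Because the reader already yields decoded chars and the StreamWriter encodes them as UTF-8, no `Encoding.Convert` step should be needed. The class name `Utf8PageSaver`, the helper `TranscodeToUtf8`, and the UTF-8 fallback for a missing charset are assumptions of this sketch, not part of the original posts. It is also worth checking the WebClient snippet above: to my knowledge `WebClient.Encoding` applies to the string methods such as `DownloadString`, while `DownloadFile` writes the response bytes as received.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class Utf8PageSaver
{
    // Hypothetical helper: decodes `input` using `sourceEncoding` and writes
    // the text to `output` as UTF-8, one buffer at a time.
    // Both streams are closed when the method returns.
    public static void TranscodeToUtf8(Stream input, Encoding sourceEncoding, Stream output)
    {
        // UTF8Encoding(false) omits the byte order mark; Encoding.UTF8 would emit one.
        using (var reader = new StreamReader(input, sourceEncoding))
        using (var writer = new StreamWriter(output, new UTF8Encoding(false)))
        {
            char[] buffer = new char[1024];
            int n;
            // Read returns the number of chars actually read, or 0 at end of stream.
            while ((n = reader.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Write only the chars that were read, not the whole buffer.
                writer.Write(buffer, 0, n);
            }
        }
    }

    // Downloads `url` and saves the body to `pathname` as UTF-8 text.
    public static HttpStatusCode Download(string url, string pathname)
    {
        HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
        using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
        using (Stream body = resp.GetResponseStream())
        using (FileStream file = new FileStream(pathname, FileMode.Create))
        {
            // Assumption: fall back to UTF-8 when the response names no charset.
            Encoding enc = string.IsNullOrEmpty(resp.CharacterSet)
                ? Encoding.UTF8
                : Encoding.GetEncoding(resp.CharacterSet);
            TranscodeToUtf8(body, enc, file);
            return resp.StatusCode;
        }
    }
}
```

A call such as `Utf8PageSaver.Download(url, pathname)` would then replace the body of the question's method. Memory use stays bounded by the buffer size, and a page whose declared charset is wrong will still be decoded wrongly, since the sketch trusts `resp.CharacterSet`.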