2016-09-29 36 views
1

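Disable image downloading for HttpWebRequest

Is it possible to tell a web request to fetch only text-based data from a website? If so, how do I do that?

The only thing I can think of is to search the response string and remove all the image tags. But that is a pretty bad way to do it...

Edit: here is my code snippet: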

    string baseUrl = kvPair.Value[0];
    string loginUrl = kvPair.Value[1];
    string notifyUrl = kvPair.Value[2];
    const string userAgent = "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.116 Safari/537.36";
    cc = new CookieContainer();
    string loginDetails = DataCollector.GetLoginDetails(baseUrl, ref cc);

    // Log in first so that the session cookies end up in cc.
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(loginUrl);
    request.Method = "POST";
    request.Accept = "text/*";
    request.ContentType = "application/x-www-form-urlencoded; charset=UTF-8";
    request.CookieContainer = cc;
    request.UserAgent = userAgent;
    // Encode with UTF-8 to match the charset declared in ContentType.
    byte[] data = Encoding.UTF8.GetBytes(loginDetails);
    request.ContentLength = data.Length;
    using (Stream s = request.GetRequestStream())
    {
        s.Write(data, 0, data.Length);
    }
    // The login response body is not needed; dispose it to release the connection.
    using (request.GetResponse()) { }

    // Fetch the notification page with the same cookies.
    request = (HttpWebRequest)WebRequest.Create(notifyUrl);
    request.UserAgent = userAgent;
    request.CookieContainer = cc;
    using (HttpWebResponse res = (HttpWebResponse)request.GetResponse())
    using (StreamReader sr = new StreamReader(res.GetResponseStream()))
    {
        ViewData["data"] += "<div style=\"float: left; margin-bottom: 50px;\">" + sr.ReadToEnd() + "</div>";
    }
+1

Try setting the Accept header to text/html only – Crowcoder

+1

Try setting the [Accept](https://msdn.microsoft.com/en-us/library/system.net.httpwebrequest.accept(v=vs.110).aspx) property.

+0

@Crowcoder No, that did not work – Snickbrack

Answers

0

I found a decent workaround in code myself:

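public static string StripImages(string input)
{
    // Match <img ...> in any letter case; [^>]* also covers tags that span several lines.
    return Regex.Replace(input, @"<img\b[^>]*>", String.Empty, RegexOptions.IgnoreCase);
}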

This strips every image tag, but only after the response has already been downloaded, so this solution saves nothing in data transfer...
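For anyone who wants to try the idea in isolation, here is a minimal sketch. The class name StripImagesDemo and the sample HTML string are made up for illustration, and the case-insensitive pattern is an assumption of this sketch rather than something the answer requires.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StripImagesDemo
{
    // Removes every <img ...> tag from an HTML string.
    // Hypothetical variant of the helper above, using a case-insensitive pattern.
    public static string StripImages(string input)
    {
        return Regex.Replace(input, @"<img\b[^>]*>", String.Empty, RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        // Made-up sample markup with one upper-case image tag.
        string html = "<p>Hello</p><IMG src=\"a.png\"><p>World</p>";
        Console.WriteLine(StripImages(html)); // prints <p>Hello</p><p>World</p>
    }
}
```

Worth keeping in mind: HttpWebRequest only fetches the resource at the URL it is given and does not follow the img references inside the returned HTML. Stripping the tags therefore mainly stops whatever later renders the markup (for example a browser showing the ViewData output) from requesting the images.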

0

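Section 14.1 of the HTTP/1.1 "Header Field Definitions" (RFC 2616) contains the definition of the Accept header. It states the following:

... If an Accept header field is present, and if the server cannot send a response which is acceptable according to the combined Accept field value, then the server SHOULD send a 406 (not acceptable) response.

So it is up to the server whether it honors the client's request.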
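Since the server may ignore Accept, one option is to check on the client side what actually came back before reading the body. The following is only a sketch: IsTextContentType and the class name ContentTypeCheck are hypothetical names introduced here, not part of any library.

```csharp
using System;

public static class ContentTypeCheck
{
    // Hypothetical helper: true when a Content-Type header value names a text/* media type.
    public static bool IsTextContentType(string contentType)
    {
        if (string.IsNullOrWhiteSpace(contentType)) return false;
        // Drop parameters such as "; charset=utf-8" before comparing.
        string mediaType = contentType.Split(';')[0].Trim();
        return mediaType.StartsWith("text/", StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(IsTextContentType("text/html; charset=utf-8")); // prints True
        Console.WriteLine(IsTextContentType("image/png"));                // prints False
    }
}
```

With an HttpWebResponse, the value to pass in would be its ContentType property. If the check fails, you can dispose the response instead of calling ReadToEnd on its stream.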

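I found that most servers ignore the Accept header. So far I have found only one exception: GitHub. I requested the GitHub home page with audio as the Accept value, and it responded properly with status code 406.

Try the demo snippet below; you should get System.Net.WebException: The remote server returned an error: (406) Not Acceptable.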

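HttpWebRequest request = (HttpWebRequest)WebRequest.Create("https://github.com/");
request.Method = "GET";
request.Accept = "audio/*"; // a media type the home page cannot be served as

// Throws WebException (406) when the server enforces the Accept header.
var response = request.GetResponse();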
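If you would rather handle the 406 than let the exception propagate, something along these lines should work. It is a sketch: IsNotAcceptable and the class name NotAcceptableCheck are hypothetical, and whether GitHub still answers with 406 today is an assumption taken from the observation above.

```csharp
using System;
using System.Net;

public static class NotAcceptableCheck
{
    // Hypothetical helper: true when the exception carries an HTTP 406 response.
    public static bool IsNotAcceptable(WebException ex)
    {
        var response = ex.Response as HttpWebResponse;
        return response != null && response.StatusCode == HttpStatusCode.NotAcceptable;
    }

    public static void Main()
    {
        var request = (HttpWebRequest)WebRequest.Create("https://github.com/");
        request.Method = "GET";
        request.Accept = "audio/*";

        try
        {
            using (request.GetResponse())
            {
                Console.WriteLine("The server returned a response despite Accept: audio/*");
            }
        }
        catch (WebException ex)
        {
            // Rethrow anything that is not a 406 (timeouts, DNS failures, other status codes).
            if (!IsNotAcceptable(ex)) throw;
            Console.WriteLine("The server refused the requested media type (406)");
        }
    }
}
```

HttpWebRequest reports non-success status codes by throwing WebException, so the status code has to be read from the exception's Response property rather than from the return value of GetResponse.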