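[C#] Get the source code of a website (404 error)
I have to fetch the source code of ~1000 websites for a school project. I am using HttpWebRequest in a for loop. However, more than half of the websites in my list return a 404 error, so the site supposedly cannot be found. When I open these same sites in Chrome, Firefox or Internet Explorer, everything works fine.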
Here is my code to get the source:
using System.IO;
using System.Net;
using System.Text;

// Downloads the page at `url` and returns its HTML, or null if the status is not 200 OK.
// Note: GetResponse() throws a WebException for 4xx/5xx responses such as 404.
public string getSource(string url)
{
    string data = null;
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    {
        if (response.StatusCode == HttpStatusCode.OK)
        {
            // Fall back to the default encoding when the server does not declare a charset.
            using (Stream receiveStream = response.GetResponseStream())
            using (StreamReader readStream = string.IsNullOrEmpty(response.CharacterSet)
                ? new StreamReader(receiveStream)
                : new StreamReader(receiveStream, Encoding.GetEncoding(response.CharacterSet)))
            {
                data = readStream.ReadToEnd();
            }
        }
    }
    return data;
}
Maybe it fails because of the sheer number of websites (1000)?
Maybe you should give us a few URLs that succeed and a few that fail, so we can check them. – Kell 2014-11-24 16:19:27
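One way to produce the kind of list Kell asks for is to record the status of every URL instead of stopping at the first failure. The sketch below is only an illustration, not a confirmed fix: `UrlProbe`, `GetStatus` and `Partition` are hypothetical names, the URLs are assumed to be absolute http/https addresses, and the `UserAgent` value is an assumption (HttpWebRequest sends no User-Agent unless one is set, and some servers may answer such requests differently than they answer a browser).

```csharp
using System;
using System.Collections.Generic;
using System.Net;

public static class UrlProbe
{
    // Hypothetical helper: returns the HTTP status code for a URL,
    // or -1 when no HTTP response was received at all (DNS failure, timeout, ...).
    public static int GetStatus(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        // Assumption: a browser-like User-Agent; the exact string is made up for this sketch.
        request.UserAgent = "Mozilla/5.0 (compatible; SchoolProjectCrawler/1.0)";
        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                return (int)response.StatusCode;
            }
        }
        catch (WebException ex)
        {
            // GetResponse() throws for 4xx/5xx; the response, if any, travels on the exception.
            var http = ex.Response as HttpWebResponse;
            return http != null ? (int)http.StatusCode : -1;
        }
    }

    // Splits URLs into those that answer 200 and the rest, keeping the status of each failure.
    // The probe is passed in so the loop can be exercised without network access.
    public static void Partition(IEnumerable<string> urls, Func<string, int> probe,
                                 List<string> ok, Dictionary<string, int> failed)
    {
        foreach (var url in urls)
        {
            int status = probe(url);
            if (status == 200) ok.Add(url);
            else failed[url] = status;
        }
    }
}
```

Calling `UrlProbe.Partition(urls, UrlProbe.GetStatus, ok, failed)` on the real list would then give one collection of working URLs and one of failing URLs with their status codes, which could be posted alongside the question.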