Can .NET WebRequest/WebResponse correctly convert accented characters, diacritics, and entities?
As a temporary hack, I am "screen scraping" my own pages using .NET's WebRequest.
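This works well, but accented characters and characters with diacritics are not converted correctly.
I'm wondering whether there is a way to convert them correctly using .NET's many built-in properties and methods.
Here is the code I use to grab the pages: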
using System.IO;
using System.Net;
using System.Text.RegularExpressions;

private string getArticle(string urlToGet)
{
    // Here's the workhorse of what we're doing: the WebRequest object
    // fetches the URL.
    WebRequest objRequest = WebRequest.Create(urlToGet);
    // The WebResponse object gets the request's response (the HTML).
    using (WebResponse objResponse = objRequest.GetResponse())
    // Now wrap the response stream in a StreamReader...
    using (StreamReader oSR = new StreamReader(objResponse.GetResponseStream()))
    {
        // ...and dump the StreamReader into a string.
        string strContent = oSR.ReadToEnd();
        // Set up the regular expression to snatch what's between the
        // BEGIN and END markers.
        Regex regex = new Regex("<!-- content_starts_here //-->((.|\n)*?)<!-- content_ends_here //-->",
            RegexOptions.IgnoreCase);
        // Apply the regular expression to the string using the Match object.
        Match oM = regex.Match(strContent);
        // Bam! We return the value from our Match, and we're in business.
        return oM.Value;
    }
}
Sorry to comment on something completely unrelated to the question, but you use too many comments. Seriously. – Chris 2009-04-29 23:29:31