How to capture a specific HTML class with C#

I am using C# to get a specific HTML class from an HTML page, but I get the whole page. I would like to isolate just one specific div:

<div class="row row-dia-obituario">

I use this code to get the HTML, and it returns the complete HTML of the page:

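// webProxy and encoding are assumed to be defined elsewhere.
var request = (HttpWebRequest)WebRequest.Create("https://pt.wikipedia.org/wiki/Wikip%C3%A9dia:P%C3%A1gina_principal");
request.Proxy = webProxy;
request.Timeout = 20000; // milliseconds
request.Method = "GET";
request.KeepAlive = true;
string html;
// Dispose the response and the reader once the body has been read.
using (var response = (HttpWebResponse)request.GetResponse())
using (var sr = new StreamReader(response.GetResponseStream(), encoding))
{
    html = sr.ReadToEnd();
}
// Remove &quot; entities, then decode the remaining HTML entities.
string htmlaux = Regex.Replace(html, "&quot;", "").Trim();
html = System.Net.WebUtility.HtmlDecode(htmlaux);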

Source

2016-11-01 Felipe Oliveira

What do you mean by "catch a specific HTML class with C#"? – SeM

It is strange that researching this topic led you to use [regex](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/). You may want to search again and look for answers that use HtmlAgilityPack or any other real parser. –

Of course it returns the full HTML content; you are not filtering anything. By the way, regex is definitely [not the way to go](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) –

Don't use regular expressions to parse HTML. Use an HTML parser; you can take a look at the Html Agility Pack.

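// Requires: using System.Linq; using HtmlAgilityPack;
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);

// GetAttributeValue returns "" for a div without a class attribute,
// so the comparison does not throw a NullReferenceException on such divs.
var divNode = doc.DocumentNode.Descendants("div")
    .FirstOrDefault(x => x.GetAttributeValue("class", "") == "row row-dia-obituario");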
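
The exact comparison of the class attribute above only matches when the attribute text is literally "row row-dia-obituario"; a different class order or an extra class would make it miss. As a minimal sketch, assuming the HtmlAgilityPack NuGet package is installed, the lookup could instead compare individual class names. The helper name `FindDivByClasses` and the sample markup are made up for illustration and are not part of the library or of the original page.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack; // NuGet package "HtmlAgilityPack"

public static class DivExtractor
{
    // Hypothetical helper: returns the first <div> whose class list contains
    // every name in requiredClasses, or null when no such div exists.
    public static HtmlNode FindDivByClasses(string html, params string[] requiredClasses)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return doc.DocumentNode
            .Descendants("div")
            .FirstOrDefault(div =>
            {
                // Split the class attribute into individual class names;
                // a div without a class attribute yields an empty array.
                var classes = div.GetAttributeValue("class", "")
                    .Split(new[] { ' ', '\t', '\r', '\n' },
                           StringSplitOptions.RemoveEmptyEntries);
                return requiredClasses.All(c => classes.Contains(c));
            });
    }

    public static void Main()
    {
        // Made-up sample markup standing in for the downloaded page.
        string html =
            "<html><body>" +
            "<div class=\"row\">other</div>" +
            "<div class=\"row-dia-obituario  row\"><p>wanted</p></div>" +
            "</body></html>";

        HtmlNode div = FindDivByClasses(html, "row", "row-dia-obituario");

        // InnerHtml is the content of the div; OuterHtml would also
        // include the <div> tag itself.
        Console.WriteLine(div != null ? div.InnerHtml : "div not found");
    }
}
```

If the exact attribute text is known to be stable, HtmlAgilityPack's XPath support is a shorter alternative: `doc.DocumentNode.SelectSingleNode("//div[@class='row row-dia-obituario']")` returns the first matching node, or null when nothing matches.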

Source

2016-11-01 12:50:55 mybirthname
