Getting href tags from HTML data in C#

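I am using the WebClient class to download the HTML of a web page. Now I want to extract the complete href links and their titles from that HTML. Initially I used a loop; feeling that it was inefficient, I switched to RegExp, but didn't get a working solution.

My initial code: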

for (int i = 0; i < htmldata.Length - 5; i++)
{
    if (htmldata.Substring(i, 5) == "href=")
    {
        n1 = htmldata.Substring(i + 6, htmldata.Length - (i + 6)).IndexOf("\"");
        Sublink = htmldata.Substring(i + 6, n1);
        var absoluteUri = new Uri(baseUri, temp);
        n2 = htmldata.Substring(i + n1 + 1, htmldata.Length - (i + n1 + 1)).IndexOf("<");
        subtitle = htmldata.Substring(i + 6 + n1 + 2, n2 - 7);
    }
}

This code returns some links like these:

/l.href.replace(new RegExp(

/advanced_search?hl=en&q=&hl=en&

and titles like these:

onclick=gbar.qs(this) class=gb2>Photos 

")+"q="+encodeURIComponent(b)})}i.qs=n;function o(a,b,d,c,f,e){var g=document.getElementById(a);if(g){var

which are clearly invalid. Please give me correct code for getting valid href links and their titles.

Source

2010-04-02 Gokul

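Use the HTML Agility Pack to parse the HTML; you can then use XPath expressions to select all the links on the page and their associated data.

Trying to parse HTML yourself is error-prone and fragile, as you have already discovered.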
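
A minimal sketch of that approach, assuming the HtmlAgilityPack NuGet package is referenced by the project. The class name LinkExtractor, the sample markup, and the example.com base address are hypothetical and only there for illustration; htmldata and baseUri mirror the names used in the question.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

public static class LinkExtractor
{
    // Returns (absolute URL, link text) pairs for every <a> element that has an href attribute.
    public static List<KeyValuePair<Uri, string>> ExtractLinks(string html, Uri baseUri)
    {
        var results = new List<KeyValuePair<Uri, string>>();

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
            return results;

        foreach (HtmlNode a in anchors)
        {
            // Decode entities such as &amp; in case they are still encoded.
            string href = HtmlEntity.DeEntitize(a.GetAttributeValue("href", string.Empty));
            // InnerText concatenates the text of nested elements and skips the tags themselves.
            string title = HtmlEntity.DeEntitize(a.InnerText).Trim();

            // Resolve relative links such as "/advanced_search?..." against the page address.
            Uri absolute;
            if (Uri.TryCreate(baseUri, href, out absolute))
                results.Add(new KeyValuePair<Uri, string>(absolute, title));
        }

        return results;
    }

    public static void Main()
    {
        // Hypothetical sample markup and base address.
        string htmldata = "<p><a class=gb2 href=\"/advanced_search?hl=en\">Advanced search</a></p>";
        var baseUri = new Uri("http://www.example.com/");

        foreach (KeyValuePair<Uri, string> link in ExtractLinks(htmldata, baseUri))
            Console.WriteLine(link.Key + " -> " + link.Value);
    }
}
```

Because the XPath expression only matches real a elements, text such as href= inside inline scripts is not picked up, which appears to be where the bogus links in the question come from. Links with schemes like javascript: still resolve to a Uri, so you may want to filter on absolute.Scheme.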

Source

2010-04-02 09:18:10 Oded

RegEx match open tags except XHTML self-contained tags

Source

2010-04-02 09:06:44 wRAR

He isn't trying to parse it with a regular expression. He is using substrings and indexes. – Oded 2010-04-02 09:54:38
