Parsing HTML using LINQ

I need help parsing an HTML document. I am new to C# and LINQ, and nothing I have tried so far has succeeded in extracting "link" and "Name 1".

 <tr class="Row"> 
       <td width="80"> 
       <div align="left"> <a href="link">details</a> 
       </div> 
       </td> 
       <td width="152">Name 1</td> 
       <td width="151">Name 2</td> 
       <td width="152">Name 3</td> 
       <td width="151">Name 4</td> 
       <td width="151">Name 5</td> 
       <td width="152">Name 6</td> 
     </tr> 

     <tr class="Row"> 
       <td width="80"> 
       <div align="left"> <a href="link">details</a> 
       </div> 
       </td> 
       <td width="152">Name 1</td> 
       <td width="151">Name 2</td> 
       <td width="152">Name 3</td> 
       <td width="151">Name 4</td> 
       <td width="151">Name 5</td> 
       <td width="152">Name 6</td> 
     </tr>

Here is my attempt:

   var links = htmlDoc.DocumentNode.Descendants() 
        .Where(n => n.Name == "tr") 
        .Where(x => x.Attributes["class"] != null && x.Attributes["class"].Value == "Row") 
        .Select(x => x.Descendants() 
        .Where(s => s.Name == "href")); 

       foreach (var link in links) 
       { 
        Debug.WriteLine(link); 
       }

2015-02-08 Macaret

Are you using Html Agility Pack? – igorushi 2015-02-08 10:36:54

I am using HtmlAgilityPack-PCL – Macaret 2015-02-08 10:38:02

Check the answer and tell me if anything is unclear – mybirthname 2015-02-08 10:42:18

       // Select every <a> element, plus every <td> whose width is not "80"
       // and whose parent is a <tr class="Row">.
       var nodes = htmlDoc.DocumentNode.Descendants()
        .Where(n => n.Name == "a" ||
            (n.Name == "td"
             && n.Attributes["width"] != null && n.Attributes["width"].Value != "80"
             && n.ParentNode.Name == "tr"
             && n.ParentNode.Attributes["class"] != null
             && n.ParentNode.Attributes["class"].Value == "Row"));

       foreach (var node in nodes)
       {
        if (node.Attributes["href"] != null)
        {
         // <a> element: print the link target.
         Debug.WriteLine(node.Attributes["href"].Value);
        }
        else
        {
         // <td> cell: print its text.
         Debug.WriteLine(node.InnerText);
        }
       }

You need something like this. It takes every node named a, plus every td node whose width is not 80 and whose parent tr has class="Row".
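
Since the question asks for the link and "Name 1" of each row, it may help to keep the two values paired per row instead of flattening them into one sequence. The following is only a sketch: the sample markup is a shortened, hypothetical copy of the question's HTML, the `Link` and `Name1` property names are made up for illustration, and it assumes that `Descendants(name)`, `Elements(name)` and `GetAttributeValue` are available in the HtmlAgilityPack-PCL build as they are in the regular Html Agility Pack.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical sample input, shortened from the question's markup.
        var html = @"<table>
<tr class=""Row"">
  <td width=""80""><div align=""left""><a href=""link"">details</a></div></td>
  <td width=""152"">Name 1</td>
  <td width=""151"">Name 2</td>
</tr>
</table>";

        var htmlDoc = new HtmlDocument();
        htmlDoc.LoadHtml(html);

        // One result per <tr class="Row">: the first href found in the row
        // and the text of the first cell whose width is not "80".
        var rows = htmlDoc.DocumentNode.Descendants("tr")
            .Where(tr => tr.GetAttributeValue("class", "") == "Row")
            .Select(tr => new
            {
                Link = tr.Descendants("a")
                         .Select(a => a.GetAttributeValue("href", ""))
                         .FirstOrDefault(),
                Name1 = tr.Elements("td")
                          .Where(td => td.GetAttributeValue("width", "") != "80")
                          .Select(td => td.InnerText.Trim())
                          .FirstOrDefault()
            });

        foreach (var row in rows)
        {
            Console.WriteLine(row.Link + " | " + row.Name1);
        }
    }
}
```

`GetAttributeValue` returns the supplied default when the attribute is missing, which removes the need for the explicit null checks on `Attributes["class"]`.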

2015-02-08 10:38:29 mybirthname

Thanks for your help – Macaret 2015-02-08 11:43:53

You're welcome – mybirthname 2015-02-08 16:31:42

Your LINQ does not reflect the structure of the HTML. This can be done simply with XPath.

var links = htmlDoc.DocumentNode
    .SelectNodes("//tr[@class='Row']/td/div/a")
    .Select(aElem => aElem.Attributes["href"].Value);

2015-02-08 10:43:55 igorushi

I cannot use XPath because my project targets Windows Phone 8.1 – Macaret 2015-02-08 10:53:15
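
Where XPath is not available, the same tr/td/div/a path can be approximated with plain LINQ by descending one level at a time. This is a rough sketch, not a tested Windows Phone solution: the sample markup is a hypothetical one-row excerpt of the question's HTML, and it assumes the PCL build exposes `Elements(name)` and `GetAttributeValue` like the regular Html Agility Pack.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

class XPathFreeLinks
{
    static void Main()
    {
        // Hypothetical sample input, shortened from the question's markup.
        var htmlDoc = new HtmlDocument();
        htmlDoc.LoadHtml(
            "<table><tr class=\"Row\"><td width=\"80\"><div align=\"left\">" +
            "<a href=\"link\">details</a></div></td>" +
            "<td width=\"152\">Name 1</td></tr></table>");

        // After finding the rows, each step moves to direct children only,
        // mirroring the path tr[@class='Row']/td/div/a.
        var links = htmlDoc.DocumentNode.Descendants("tr")
            .Where(tr => tr.GetAttributeValue("class", "") == "Row")
            .SelectMany(tr => tr.Elements("td"))
            .SelectMany(td => td.Elements("div"))
            .SelectMany(div => div.Elements("a"))
            .Select(a => a.GetAttributeValue("href", ""));

        foreach (var link in links)
        {
            Console.WriteLine(link);
        }
    }
}
```

Unlike a filter on `Descendants()` alone, this chain only matches anchors that sit at exactly that position in the row, so stray links elsewhere in the page are ignored.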
