2011-06-08 88 views
14
<html> 
    <body> 
     <div class="main"> 
      <div class="submain"><h2></h2><p></p><ul></ul> 
      </div> 
      <div class="submain"><h2></h2><p></p><ul></ul> 
      </div> 
     </div> 
    </body> 
</html> 

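I load the HTML into an HtmlDocument and then select the submain divs with XPath. But I don't know how to access each tag inside them separately, i.e. the h2 and the p.

How do I access child nodes from a node in HtmlAgilityPack?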

HtmlAgilityPack.HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//div[@class=\"submain\"]"); 
foreach (HtmlAgilityPack.HtmlNode node in nodes) {} 

If I use node.InnerText I get all of the text at once, and InnerHtml is no use either. How do I select the individual tags?

Answers

25

The following should help:

HtmlAgilityPack.HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//div[@class=\"submain\"]"); 
foreach (HtmlAgilityPack.HtmlNode node in nodes) { 
    // To reach the <h2> or <p> inside each submain div, query relative to the current node:
    HtmlAgilityPack.HtmlNode h2Node = node.SelectSingleNode("./h2"); // first <h2> that is a direct child (null if none)
    HtmlAgilityPack.HtmlNodeCollection allH2Nodes = node.SelectNodes(".//h2"); // every <h2> at any depth (null if none)

    // You can also look at the children directly, without XPath (like in a tree):
    HtmlAgilityPack.HtmlNode firstH2Child = node.ChildNodes["h2"];
} 
2

From memory, I think each node has its own ChildNodes collection, so inside your for…each block you should be able to inspect node.ChildNodes.
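
A minimal sketch of that idea, assuming the HtmlAgilityPack NuGet package is referenced. The sample markup mirrors the question, but the text inside the tags is made up for illustration. ChildNodes also contains whitespace text nodes, so the loop keeps only element children:

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical sample input modeled on the question's HTML.
        var html = "<div class=\"main\">" +
                   "<div class=\"submain\"><h2>Title A</h2><p>Text A</p><ul></ul></div>" +
                   "<div class=\"submain\"><h2>Title B</h2><p>Text B</p><ul></ul></div>" +
                   "</div>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var nodes = doc.DocumentNode.SelectNodes("//div[@class='submain']");
        if (nodes == null) return; // SelectNodes returns null when nothing matches

        foreach (HtmlNode node in nodes)
        {
            foreach (HtmlNode child in node.ChildNodes)
            {
                // Skip text and comment nodes; keep only element children.
                if (child.NodeType != HtmlNodeType.Element) continue;
                Console.WriteLine($"{child.Name}: {child.InnerText}");
            }
        }
    }
}
```

Each child's Name property gives the tag name (h2, p, ul), so you can branch on it if you need to treat the tags differently.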

0

You are looking for Descendants:

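// Requires: using System.Linq;
var firstSubmainText = doc
    .DocumentNode
    .Descendants()
    // GetAttributeValue avoids a NullReferenceException on nodes that have no class attribute
    .Where(n => n.GetAttributeValue("class", "") == "submain")
    .First()
    .InnerText;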