2012-12-23 75 views
0

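HTML to C# objects, recursive function?

I want to convert an HTML document into C# objects. I have a sample list of names in an ordered list, shown below. I am using Html Agility Pack.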

<ol> 
    <li>Heather</li> 
    <li>Channing</li> 
    <li>Briana</li> 
    <li>Amber</li> 
    <li>Sabrina</li> 
    <li>Jessica 
     <ol> 
      <li>Melody</li> 
      <li>Dakota</li> 
      <li>Sierra</li> 
      <li>Vandi</li> 
      <li>Crystal</li> 
      <li>Samantha</li> 
      <li>Autumn</li> 
      <li>Ruby</li> 
     </ol></li> 
    <li>Taylor</li> 
    <li>Tara</li> 
    <li>Tammy</li> 
    <li>Laura</li> 
    <li>Shelly</li> 
    <li>Shantelle</li> 
    <li>Bob and Alice 
     <ol> 
     <li>Courtney</li> 
     <li>Misty</li> 
     <li>Jenny</li> 
     <li>Christa</li> 
     <li>Mindy</li> 
     </ol></li> 
    <li>Noel</li> 
    <li>Shelby</li> 
</ol> 

These are the objects I created to represent the list of names, i.e. people and their children.

public class PeopleList { 
    public List<Person> People {get; set;} 
} 

public class Person { 
    public string Name {get; set;} 
    public PeopleList Children {get; set;} 
} 

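I am thinking that a recursive function would be the best way to create these objects. Can anyone offer ideas on how to convert the HTML into C# objects?

Abu.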

+0

Just use XPath to get the part of the document you need, then use a recursive function to return the nested list.

+0

Do you have an example? I have been trying to use a recursive function but cannot figure it out.

+0

How do you want to populate it: a) a PeopleList of people who have children of type Person, or b) a list of persons, each with its own PeopleList? – Anthill

Answers

2
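using System.Collections.Generic;
using System.Linq;

HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);

var list = Recurse(doc.DocumentNode);

// Returns the <ol> directly under root as a PeopleList, or null if there is none.
// Wrapped in PeopleList so it matches the Person.Children type from the question.
PeopleList Recurse(HtmlAgilityPack.HtmlNode root)
{
    var ol = root.Element("ol");
    if (ol == null) return null;

    return new PeopleList
    {
        People = ol.Elements("li")
            .Select(li => new Person
            {
                // The first child of each <li> is the text node holding the name.
                Name = li.FirstChild.InnerText.Trim(),
                Children = Recurse(li)
            })
            .ToList()
    };
}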
0

I would look into HTMLAgilityPack: http://htmlagilitypack.codeplex.com/

I have not used it for this in particular, but it works really well for parsing HTML.

+0

I am already using HtmlAgilityPack. I am trying to figure out how to parse this list of names and child lists into C# objects.

0

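For fun, or if you really do want a list of PeopleList objects whose people each carry their own PeopleList :P, you could do it like this (HtmlAgilityPack is not needed for the markup you posted, since it parses as well-formed XML):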

namespace StackFun 
{ 
    using System.Collections.Generic; 
    using System.Linq; 
    using System.Xml.Linq; 

    public class PeopleList 
    { 
     public List<Person> People { get; set; } 
    } 

    public class Person 
    { 
     public string Name { get; set; } 
     public PeopleList Children { get; set; } 
    } 

    class Program 
    { 
     static IEnumerable<PeopleList> GetChildren(PeopleList parent, IEnumerable<XElement> children) 
     { 
      parent.People = new List<Person>(); 
      foreach (var child in children) 
      { 
       var person = new Person 
       { 
        Name = ((XText)child.FirstNode).Value.Trim(new[] { ' ', '\r', '\n' }), 
       }; 
       parent.People.Add(person); 
       foreach (var childrenOf in child.Elements("ol").SelectMany(BuildFromXml)) 
       { 
        person.Children = childrenOf; 
       } 
      } 
      yield return parent; 

     } 

     static IEnumerable<PeopleList> BuildFromXml(XElement node) 
     { 
      return GetChildren(new PeopleList(), node.Elements("li")); 
     } 

     static void Main(string[] args) 
     { 
      const string xml = @"<ol> 
      <li>Heather</li> 
      <li>Channing</li> 
      <li>Briana</li> 
      <li>Amber</li> 
      <li>Sabrina</li> 
      <li>Jessica 
       <ol> 
        <li>Melody</li> 
        <li>Dakota</li> 
        <li>Sierra</li> 
        <li>Vandi</li> 
        <li>Crystal</li> 
        <li>Samantha</li> 
        <li>Autumn</li> 
        <li>Ruby</li> 
       </ol></li> 
      <li>Taylor</li> 
      <li>Tara</li> 
      <li>Tammy</li> 
      <li>Laura</li> 
      <li>Shelly</li> 
      <li>Shantelle</li> 
      <li>Bob and Alice 
       <ol> 
       <li>Courtney</li> 
       <li>Misty</li> 
       <li>Jenny</li> 
       <li>Christa</li> 
       <li>Mindy</li> 
       </ol></li> 
      <li>Noel</li> 
      <li>Shelby</li> 
     </ol>"; 

      var doc = XDocument.Parse(xml); 
      var listOfPeople = BuildFromXml(doc.Root).ToList(); 
     } 
    } 
} 

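What you probably want, though (I am guessing, since you did not specify), is a list of people and their children, which you could start with: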

static IEnumerable<Person> Populate(IEnumerable<XElement> children) 
{ 
    foreach (var child in children) 
    { 
      var person = new Person 
      { 
       Name = ((XText)child.FirstNode).Value.Trim(new[] { ' ', '\r', '\n' }), 
       Children = new PeopleList() 

      }; 
      person.Children.People = new List<Person>(); 
      foreach (var childrenOf in child.Elements("ol").SelectMany(BuildFromXml)) 
      { 
       person.Children.People.Add(childrenOf); 
      } 
      yield return person; 
    } 

} 

static IEnumerable<Person> BuildFromXml(XElement node) 
{ 
    return Populate(node.Elements("li")); 
} 
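
For anyone who wants to try the XML version quickly, here is a minimal, self-contained sketch along the same lines. It is not the code above verbatim: `FamilyTreeDemo` and `Parse` are names made up for this example, and it picks the first text node under each `<li>` rather than casting `FirstNode`, so an `<li>` that starts with a nested list should not throw.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class PeopleList
{
    public List<Person> People { get; set; }
}

public class Person
{
    public string Name { get; set; }
    public PeopleList Children { get; set; }
}

// Hypothetical helper class, named only for this example.
public static class FamilyTreeDemo
{
    // Yields one Person per <li>; nested <ol> elements are handled by recursing.
    static IEnumerable<Person> Populate(IEnumerable<XElement> items)
    {
        foreach (var li in items)
        {
            // The first text node directly under the <li> holds the name.
            var text = li.Nodes().OfType<XText>().FirstOrDefault();
            var person = new Person
            {
                Name = text == null ? string.Empty : text.Value.Trim(),
                Children = new PeopleList { People = new List<Person>() }
            };

            foreach (var ol in li.Elements("ol"))
            {
                person.Children.People.AddRange(Populate(ol.Elements("li")));
            }

            yield return person;
        }
    }

    // Assumes the input is well-formed XML whose root element is the outer <ol>.
    public static List<Person> Parse(string xml)
    {
        return Populate(XDocument.Parse(xml).Root.Elements("li")).ToList();
    }
}
```

Calling `FamilyTreeDemo.Parse(xml)` with the string from the question should give one top-level `Person` per outer `<li>`, with Jessica's and "Bob and Alice"'s nested names under `Children.People`. People without a nested list get an empty `People` list rather than null, which keeps later loops simple.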

If you want (or need) to use HtmlAgilityPack, the code might look like:

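using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Person and PeopleList are the classes defined earlier.
class Program 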
{ 
    static IEnumerable<Person> Populate(IEnumerable<HtmlNode> children) 
    { 
     foreach (var child in children) 
     { 
      var person = new Person 
      { 
       Name = child.InnerText.Split(new char[] { '\r', '\n' })[0].Trim(), 
       Children = new PeopleList() 

      }; 
      person.Children.People = new List<Person>(); 
      foreach (var childrenOf in child.Elements("ol").SelectMany(BuildFromHtml)) 
      { 
       person.Children.People.Add(childrenOf); 
      } 
      yield return person; 
     } 


    } 
    static IEnumerable<Person> BuildFromHtml(HtmlNode node) 
    { 
     return Populate(node.Elements("li")); 
    } 

    static void Main(string[] args) 
    { 
     const string html = @"<ol> 
      <li>Heather</li> 
      <li>Channing</li> 
      <li>Briana</li> 
      <li>Amber</li> 
      <li>Sabrina</li> 
      <li>Jessica 
       <ol> 
        <li>Melody</li> 
        <li>Dakota</li> 
        <li>Sierra</li> 
        <li>Vandi</li> 
        <li>Crystal</li> 
        <li>Samantha</li> 
        <li>Autumn</li> 
        <li>Ruby</li> 
       </ol></li> 
      <li>Taylor</li> 
      <li>Tara</li> 
      <li>Tammy</li> 
      <li>Laura</li> 
      <li>Shelly</li> 
      <li>Shantelle</li> 
      <li>Bob and Alice 
       <ol> 
       <li>Courtney</li> 
       <li>Misty</li> 
       <li>Jenny</li> 
       <li>Christa</li> 
       <li>Mindy</li> 
       </ol></li> 
      <li>Noel</li> 
      <li>Shelby</li> 
     </ol>"; 

     var doc = new HtmlDocument(); 
     doc.LoadHtml(html); 
     var listOfPeople = BuildFromHtml(doc.DocumentNode.FirstChild).ToList(); 
    } 
}
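
To check the result, the same recursion can walk the tree back out. The following is only a sketch: `TreePrinter` and `Lines` are hypothetical names, and it assumes the `Person`/`PeopleList` shape from the question, where `Children` or `People` may be null.

```csharp
using System.Collections.Generic;

public class PeopleList
{
    public List<Person> People { get; set; }
}

public class Person
{
    public string Name { get; set; }
    public PeopleList Children { get; set; }
}

// Hypothetical helper class, named only for this example.
public static class TreePrinter
{
    // Returns one line per person, indented two spaces per nesting level.
    public static IEnumerable<string> Lines(IEnumerable<Person> people, int depth = 0)
    {
        foreach (var person in people)
        {
            yield return new string(' ', depth * 2) + person.Name;

            // Skip people without a child list instead of failing on null.
            if (person.Children == null || person.Children.People == null)
            {
                continue;
            }

            foreach (var line in Lines(person.Children.People, depth + 1))
            {
                yield return line;
            }
        }
    }
}
```

Something like `foreach (var line in TreePrinter.Lines(listOfPeople)) Console.WriteLine(line);` at the end of `Main` should then print the names with the children indented under their parents.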