2016-12-15 112 views
0

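How to parse an HTML document with C#

I need to parse the document shown below. I am trying HtmlAgilityPack, but I find it very complicated. I need the text inside this tag: <td style="background: #36461f;color: #ffffff;font-weight: bold;padding: 2px;font-size: 12px;height: 25px;">Mac Bahsi</td> and the child elements that belong to it: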

<div class="div1" style="width: 288px;" onclick="parent.javaScriptAddSlip('slip', '164518117;;;-;11.25;1;Maç Bahsi;164518117')"> 
<div class="div1" style="width: 288px;" onclick="parent.javaScriptAddSlip('slip', '164518117;;;-;6.50;0;Maç Bahsi;164518117')"> 
<div class="div1" style="width: 288px;" onclick="parent.javaScriptAddSlip('slip', '164518117;;;-;1.18;2;Maç Bahsi;164518117')"> 

<!DOCTYPE HTML> 
 
<html xmlns="http://www.w3.org/1999/xhtml"> 
 
<head> 
 
    <style> 
 
     .table1 { 
 
      width: 100%; 
 
      margin: 0px; 
 
      padding: 0px; 
 
      border-collapse: collapse; 
 
      padding: 0px; 
 
     } 
 

 
     .div1 { 
 
      cursor: pointer; 
 
      margin: 1px; 
 
      border: 1px solid #999999; 
 
      float: left; 
 
      font-size: 12px; 
 
     } 
 

 
     .td1 { 
 
      text-align: center; 
 
      font-size: 20px; 
 
      font-weight: bold; 
 
      color: #33460E; 
 
      height: 20px; 
 
      padding: 0px; 
 
     } 
 

 
     .td2 { 
 
      text-align: center; 
 
      font-weight: bold; 
 
      color: #808000; 
 
      padding: 0px; 
 
     } 
 
    </style> 
 
</head> 
 
<body style="background: #FFFFCC;margin: 0px;padding: 0px;font-size: 12px;"> 
 
    <p></p> 
 
    <table style="width: 100%" cellpadding="0" cellspacing="0"> 
 
     <tr> 
 
      <td style="background: #36461f;color: #ffffff;font-weight: bold;padding: 2px;font-size: 12px;height: 25px;">Mac Bahsi</td> 
 
     </tr> 
 
     <tr> 
 
      <td> 
 
       <div class="div1" style="width: 288px;" onclick="parent.javaScriptAddSlip('slip', '164518117;;;-;11.25;1;Maç Bahsi;164518117')"> 
 
        <table class="table1"> 
 
         <tr class="menuClickable"> 
 
          <td class="td1">11.25</td> 
 
         </tr> 
 
         <tr class="menuClickable"> 
 
          <td class="td2">Club America Mexico</td> 
 
         </tr> 
 
        </table> 
 
       </div> 
 
       <div class="div1" style="width: 288px;" onclick="parent.javaScriptAddSlip('slip', '164518117;;;-;6.50;0;Maç Bahsi;164518117')"> 
 
        <table class="table1"> 
 
         <tr class="menuClickable"> 
 
          <td class="td1">6.50</td> 
 
         </tr> 
 
         <tr class="menuClickable"> 
 
          <td class="td2">Beraberlik</td> 
 
         </tr> 
 
        </table> 
 
       </div> 
 
       <div class="div1" style="width: 288px;" onclick="parent.javaScriptAddSlip('slip', '164518117;;;-;1.18;2;Maç Bahsi;164518117')"> 
 
        <table class="table1"> 
 
         <tr class="menuClickable"> 
 
          <td class="td1">1.18</td> 
 
         </tr> 
 
         <tr class="menuClickable"> 
 
          <td class="td2">Real Madrid</td> 
 
         </tr> 
 
        </table> 
 
       </div> 
 
      </td> 
 
     </tr> 
 
    </table> 
 
    <table style="width: 100%" cellpadding="0" cellspacing="0"> 
 
     <tr> 
 
      <td style="background: #36461f;color: #ffffff;font-weight: bold;padding: 2px;font-size: 12px;height: 25px;">Ilk Yari Bahsi</td> 
 
     </tr> 
 
     <tr> 
 
      <td> 
 
       <div class="div1" style="width: 288px;" onclick="parent.javaScriptAddSlip('slip', '164518128;;;-;8.50;1;İlk Yarı Bahsi;164518128')"> 
 
        <table class="table1"> 
 
         <tr class="menuClickable"> 
 
          <td class="td1">8.50</td> 
 
         </tr> 
 
         <tr class="menuClickable"> 
 
          <td class="td2">Club America Mexico</td> 
 
         </tr> 
 
        </table> 
 
       </div> 
 
       <div class="div1" style="width: 288px;" onclick="parent.javaScriptAddSlip('slip', '164518128;;;-;3.05;0;İlk Yarı Bahsi;164518128')"> 
 
        <table class="table1"> 
 
         <tr class="menuClickable"> 
 
          <td class="td1">3.05</td> 
 
         </tr> 
 
         <tr class="menuClickable"> 
 
          <td class="td2">Beraberlik</td> 
 
         </tr> 
 
        </table> 
 
       </div> 
 
       <div class="div1" style="width: 288px;" onclick="parent.javaScriptAddSlip('slip', '164518128;;;-;1.50;2;İlk Yarı Bahsi;164518128')"> 
 
        <table class="table1"> 
 
         <tr class="menuClickable"> 
 
          <td class="td1">1.50</td> 
 
         </tr> 
 
         <tr class="menuClickable"> 
 
          <td class="td2">Real Madrid</td> 
 
         </tr> 
 
        </table> 
 
       </div> 
 
      </td> 
 
     </tr> 
 
    </table> 
 
    <table style="width: 100%" cellpadding="0" cellspacing="0"> 
 
     <tr> 
 
      <td style="background: #36461f;color: #ffffff;font-weight: bold;padding: 2px;font-size: 12px;height: 25px;">İkinci Yarı Bahsi</td> 
 
     </tr> 
 
     <tr> 
 
      <td> 
 
       <div class="div1" style="width: 288px;" onclick="parent.javaScriptAddSlip('slip', '164518133;;;-;8.50;1;İkinci Yarı Bahsi;164518133')"> 
 
        <table class="table1"> 
 
         <tr class="menuClickable"> 
 
          <td class="td1">8.50</td> 
 
         </tr> 
 
         <tr class="menuClickable"> 
 
          <td class="td2">Club America Mexico</td> 
 
         </tr> 
 
        </table> 
 
       </div> 
 
       <div class="div1" style="width: 288px;" onclick="parent.javaScriptAddSlip('slip', '164518133;;;-;3.70;0;İkinci Yarı Bahsi;164518133')"> 
 
        <table class="table1"> 
 
         <tr class="menuClickable"> 
 
          <td class="td1">3.70</td> 
 
         </tr> 
 
         <tr class="menuClickable"> 
 
          <td class="td2">Beraberlik</td> 
 
         </tr> 
 
        </table> 
 
       </div> 
 
       <div class="div1" style="width: 288px;" onclick="parent.javaScriptAddSlip('slip', '164518133;;;-;1.40;2;İkinci Yarı Bahsi;164518133')"> 
 
        <table class="table1"> 
 
         <tr class="menuClickable"> 
 
          <td class="td1">1.40</td> 
 
         </tr> 
 
         <tr class="menuClickable"> 
 
          <td class="td2">Real Madrid</td> 
 
         </tr> 
 
        </table> 
 
       </div> 
 
      </td> 
 
     </tr> 
 
    </table> 
 
    <br /> 
 
    <br /> 
 
    <br /> 
 
</body> 
 
</html>


http://stackoverflow.com/questions/41160124/html-agilitypack-use-webclient-to-download-html-from-a-website-and-store-it/41160185#41160185 – mybirthname

Answers

0

You can use something like this:

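// Requires: using System.Linq; using HtmlAgilityPack;
var document = new HtmlDocument(); 
document.LoadHtml(text); 
// Descendants is a method of HtmlNode, so call it on DocumentNode.
var tables = document.DocumentNode.Descendants("table").ToList(); 
foreach (var table in tables) 
{ 
    // The leading "." keeps the XPath relative to the current table.
    // Descendants also yields the nested table1 tables; td is null for those.
    var td = table.SelectSingleNode(".//td[@style='background: #36461f;color: #ffffff;font-weight: bold;padding: 2px;font-size: 12px;height: 25px;']"); 
    ... 
    // SelectNodes returns null (not an empty collection) when nothing matches.
    var divs = table.SelectNodes(".//div[@class='div1']"); 
    ... 
} 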
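
For reference, here is a minimal self-contained sketch of the same idea. It assumes the HtmlAgilityPack NuGet package is referenced and that the page has the structure shown in the question. The class name OddsDump, the method Describe and the shortened sample markup are made up for illustration and are not part of the original post.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package, assumed to be referenced

public static class OddsDump // hypothetical name
{
    // Shortened copy of the question's markup (styles and onclick omitted).
    public const string Sample = @"<html><body>
<table><tr><td>Mac Bahsi</td></tr>
<tr><td>
<div class='div1'><table class='table1'>
<tr><td class='td1'>11.25</td></tr>
<tr><td class='td2'>Club America Mexico</td></tr>
</table></div>
<div class='div1'><table class='table1'>
<tr><td class='td1'>6.50</td></tr>
<tr><td class='td2'>Beraberlik</td></tr>
</table></div>
</td></tr></table>
</body></html>";

    // Returns one line per bet header, followed by one line per option.
    public static List<string> Describe(string html)
    {
        var lines = new List<string>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Only the top-level tables; SelectNodes returns null when nothing matches.
        HtmlNodeCollection tables = doc.DocumentNode.SelectNodes("/html/body/table");
        if (tables == null) return lines;

        foreach (HtmlNode table in tables)
        {
            // The first row holds the bet name.
            HtmlNode header = table.SelectSingleNode("tr[1]/td");
            if (header != null) lines.Add(header.InnerText.Trim());

            // The second row holds one div per option.
            HtmlNodeCollection divs = table.SelectNodes("tr[2]/td/div[@class='div1']");
            if (divs == null) continue;

            foreach (HtmlNode div in divs)
            {
                HtmlNode odds = div.SelectSingleNode(".//td[@class='td1']");
                HtmlNode name = div.SelectSingleNode(".//td[@class='td2']");
                if (odds == null || name == null) continue;
                lines.Add(odds.InnerText.Trim() + " - " + name.InnerText.Trim());
            }
        }
        return lines;
    }

    public static void Main()
    {
        foreach (string line in Describe(Sample))
            Console.WriteLine(line);
    }
}
```

Selecting by position and class name instead of by the full style string should make the query less fragile if the inline style changes, but that is a design choice and not something the page guarantees.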

Thanks for your interest. However, I could not get "Descendants" to work. I added the "System.Xml.Linq" namespace, but that did not help. –


This method belongs to the HtmlNode class in the HtmlAgilityPack namespace. – Artem


Sorry, but it does not work. Visual Studio reports a missing reference, even though the HtmlAgilityPack namespace has been added. –

0

This is how I did it, but it is the long way around. If there is a shorter or better approach, please post it.

HtmlWeb h = new HtmlWeb(); 
HtmlDocument doc = h.Load(Server.MapPath("xml/htmlpage.html")); 
HtmlNodeCollection tables = doc.DocumentNode.SelectNodes("//html/body/table"); 

for (int i = 0; i < tables.Count; i++) 
{ 
    // Relative XPath: the header cell and the option divs of the current table.
    HtmlNode header = tables[i].SelectSingleNode("tr[1]/td"); 
    HtmlNodeCollection divs = tables[i].SelectNodes("tr[2]/td[1]/div"); 
    Response.Write(string.Format("{0} --> {1}<br/>", i + 1, header.InnerHtml)); 
    for (int j = 0; j < divs.Count; j++) 
    { 
        // Read onclick by name rather than by attribute position.
        string[] items = divs[j].GetAttributeValue("onclick", "").Split(';'); 
        int oran_id = Convert.ToInt32(items[7].Replace("')", "")); // odds id
        string oranadi = items[6];                                 // bet name
        int secim = Convert.ToInt32(items[5]);                     // selection
        string oran = items[4];                                    // odds

        Response.Write(string.Format("{0} --> {1} - {2} - {3} - {4} <br/>", j + 1, secim, oran_id, oranadi, oran)); 
    } 
}
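
The string handling inside the inner loop can also be moved into a small helper that does not depend on HtmlAgilityPack. The following is only a sketch. It assumes the onclick value always looks like parent.javaScriptAddSlip('slip', '...') with eight ';'-separated fields, as in the sample page. The names Slip and SlipParser are made up for illustration.

```csharp
using System;
using System.Globalization;

public sealed class Slip // hypothetical container for one option
{
    public int OranId;      // field 7: odds id
    public string OranAdi;  // field 6: bet name
    public int Secim;       // field 5: selection
    public decimal Oran;    // field 4: odds
}

public static class SlipParser // hypothetical helper
{
    public static Slip Parse(string onclick)
    {
        // The payload is the last single-quoted argument of the call.
        int end = onclick.LastIndexOf('\'');
        int start = end > 0 ? onclick.LastIndexOf('\'', end - 1) : -1;
        if (start < 0) throw new FormatException("No quoted payload found.");

        string[] items = onclick.Substring(start + 1, end - start - 1).Split(';');
        if (items.Length < 8) throw new FormatException("Expected 8 fields.");

        return new Slip
        {
            // Invariant culture: the odds use '.' as the decimal separator,
            // which a Turkish thread culture might otherwise misread.
            OranId = int.Parse(items[7], CultureInfo.InvariantCulture),
            OranAdi = items[6],
            Secim = int.Parse(items[5], CultureInfo.InvariantCulture),
            Oran = decimal.Parse(items[4], CultureInfo.InvariantCulture)
        };
    }
}

public static class SlipDemo
{
    public static void Main()
    {
        Slip s = SlipParser.Parse(
            "parent.javaScriptAddSlip('slip', '164518117;;;-;11.25;1;Maç Bahsi;164518117')");
        Console.WriteLine("{0} - {1} - {2} - {3}", s.Secim, s.OranId, s.OranAdi, s.Oran);
    }
}
```

Cutting the payload out between the last two quotes avoids the Replace("')", "") step, and parsing the odds as a decimal makes them usable for arithmetic. Whether the extra type is worth it depends on what is done with the values afterwards.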