C#: getting data from an HTML table with HtmlAgilityPack
I want to get the information out of an HTML table by parsing the HTML with HtmlAgilityPack.
Here is what the HTML looks like:
...
...
...
<tbody>
<tr>
<td class="style_19" style="vertical-align: baseline;">
<div class="style_18">AA00857</div>
</td>
<td class="style_19" style="vertical-align: baseline;">
<div></div>
<div class="style_20">TPRCF</div>
</td>
<td class="style_19" style="vertical-align: baseline;">
<div class="style_21"></div>
</td>
<td class="style_19" style="vertical-align: baseline;">
<div class="style_21">16908/2</div>
</td>
<td class="style_19" style="vertical-align: baseline;">
<div class="style_18"> ETG_C</div>
</td>
</tr>
<tr>
<td class="style_19" style="vertical-align: baseline;">
<div class="style_18">AA</div>
</td>
<td class="style_19" style="vertical-align: baseline;">
<div></div>
<div class="style_20">TPRCF</div>
</td>
<td class="style_19" style="vertical-align: baseline;">
<div class="style_21"></div>
</td>
<td class="style_19" style="vertical-align: baseline;">
<div class="style_21">16909/19</div>
</td>
<td class="style_19" style="vertical-align: baseline;">
<div class="style_18"> ETG_C</div>
</td>
</tr>
<tr>
<td class="style_19" style="vertical-align: baseline;">
<div class="style_18">AA</div>
</td>
<td class="style_19" style="vertical-align: baseline;">
<div></div>
<div class="style_20">TPRCF</div>
</td>
<td class="style_19" style="vertical-align: baseline;">
<div class="style_21"></div>
</td>
<td class="style_19" style="vertical-align: baseline;">
<div class="style_21">16907/7</div>
</td>
<td class="style_19" style="vertical-align: baseline;">
<div class="style_18"> ETG_C</div>
</td>
</tr>
...
...
I need to extract these values from each row above, for example:
AA00857, TPRCF, 16908/2, ETG_C
So far, this is all I have:
HtmlWeb hw = new HtmlWeb();
HtmlAgilityPack.HtmlDocument htmlDoc = hw.Load(@"http://www.some123123site.com/index");
if (htmlDoc.DocumentNode != null)
{
    HtmlAgilityPack.HtmlNode bodyNode = htmlDoc.DocumentNode.SelectSingleNode("//tbody");
    if (bodyNode != null)
    {
        // Do something with bodyNode
    }
}
Please help!
Error 1: 'HtmlAgilityPack.HtmlDocument' does not contain a definition for 'DocumentElement' and no extension method 'DocumentElement' accepting a first argument of type 'HtmlAgilityPack.HtmlDocument' could be found – 2011-01-07 21:49:06
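One possible way to go from the tbody node to the cell values, as a rough sketch rather than a definitive answer. It stays on `DocumentNode` (as the question's code does), since the error above indicates `DocumentElement` is not available. The class name `TableSketch`, the method `ExtractRows`, and the constant `SampleHtml` are made up for illustration; the sample markup is the first row from the question with the style attributes left out, wrapped in a `<table>` tag that is assumed, and the real page would be loaded with `HtmlWeb.Load` instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class TableSketch
{
    // Hypothetical sample: first row of the table from the question, trimmed.
    public const string SampleHtml =
        "<table><tbody><tr>" +
        "<td class=\"style_19\"><div class=\"style_18\">AA00857</div></td>" +
        "<td class=\"style_19\"><div></div><div class=\"style_20\">TPRCF</div></td>" +
        "<td class=\"style_19\"><div class=\"style_21\"></div></td>" +
        "<td class=\"style_19\"><div class=\"style_21\">16908/2</div></td>" +
        "<td class=\"style_19\"><div class=\"style_18\"> ETG_C</div></td>" +
        "</tr></tbody></table>";

    // Returns one comma-separated string per table row, skipping empty cells.
    public static List<string> ExtractRows(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var result = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var rows = doc.DocumentNode.SelectNodes("//tbody/tr");
        if (rows == null)
        {
            return result;
        }

        foreach (var row in rows)
        {
            var cells = row.SelectNodes("./td");
            if (cells == null)
            {
                continue;
            }

            // InnerText concatenates the text of all nested divs in the cell;
            // DeEntitize decodes entities such as &amp; and Trim drops padding.
            var values = cells
                .Select(td => HtmlEntity.DeEntitize(td.InnerText).Trim())
                .Where(text => text.Length > 0);

            result.Add(string.Join(", ", values));
        }

        return result;
    }

    public static void Main()
    {
        foreach (var line in ExtractRows(SampleHtml))
        {
            Console.WriteLine(line);
        }
    }
}
```

For the sample row this should print AA00857, TPRCF, 16908/2, ETG_C. Two caveats: `//tbody/tr` matches every tbody in the document, so a page with several tables would need a narrower XPath; and if the tbody tag was only seen in a browser's inspector and is not in the raw HTML, `//table//tr` would be the selector to try, because HtmlAgilityPack does not insert a tbody the way browsers do.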