2012-02-13 57 views
1

Getting the correct XPath expression for div class

I am trying to traverse an HTML file:

<body class="style_0"> 
     <div> 
      <div class="style_1">Pending Test List</div> 
      <table style=" width: 100%;" id="AUTOGENBOOKMARK_4365445353431356880"> 
       <col> 
       <col> 
       <tbody> 
        <tr> 
         <td style="vertical-align: baseline;"> 
          <div class="style_4">Pending Test List</div> 
         </td> 
         <td style="vertical-align: baseline;"> 
          <div class="style_5">SOME AGENCY Laboratories, Inc.</div> 
         </td> 
        </tr> 
       </tbody> 
      </table> 
      <table class="style_6" style=" width: 4.531in;" id="AUTOGENBOOKMARK_5083738604442918131"> 
       <col style=" width: 1in;"> 
       <col class="style_7" style=" width: 0.75in;"> 
       <col class="style_8" style=" width: 0.6in;"> 
       <col style=" width: 0.75in;"> 
       <col style=" width: 2.375in;"> 
       <tbody> 
        <tr class="style_9" style=" height: 0.5in;"> 
         <td style="vertical-align: middle;"> 
          <div class="style_10">Report Range:</div> 
         </td> 
         <td style="vertical-align: middle;"> 
          <div class="style_11">01/01/2012</div> 
         </td> 
         <td style="vertical-align: middle;"> 
          <div class="style_12">through</div> 
         </td> 
         <td style="vertical-align: middle;"> 
          <div class="style_13">01/31/2012</div> 
         </td> 
         <td style="vertical-align: middle;"> 
          <div class="style_14">(by Date Entered)</div> 
         </td> 
        </tr> 
       </tbody> 
      </table> 
      <table class="style_15" style=" width: 100%;" id="AUTOGENBOOKMARK_7602283385844673591" iid="/526(QuRs78576248:0)"> 
       <col style=" width: 0.75in;"> 
       <col style=" width: 1.25in;"> 
       <col style=" width: 1in;"> 
       <col style=" width: 1.5in;"> 
       <col style=" width: 1.5in;"> 
       <col style=" width: 1.5in;"> 
       <col> 
       <thead> 
        <tr> 
         <td colspan="4" style="vertical-align: baseline;"></td> 
         <td style="vertical-align: baseline;"></td> 
         <td style="vertical-align: baseline;"></td> 
         <td style="vertical-align: baseline;"></td> 
        </tr> 
        <tr> 
         <td style="vertical-align: baseline;"> 
          <div class="style_16">Entered</div> 
         </td> 
         <td style="vertical-align: baseline;"> 
          <div class="style_16">Spec. ID</div> 
         </td> 
         <td style="vertical-align: baseline;"> 
          <div class="style_16">Batch/Pos.</div> 
         </td> 
         <td style="vertical-align: baseline;"> 
          <div class="style_16">Test</div> 
         </td> 
         <td style="vertical-align: baseline;"> 
          <div class="style_16">Client ID</div> 
         </td> 
         <td style="vertical-align: baseline;"> 
          <div class="style_16">Client Name</div> 
         </td> 
         <td style="vertical-align: baseline;"> 
          <div class="style_16">Agency</div> 
         </td> 
        </tr> 
       </thead> 
       <tbody> 
        <tr> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_18">1/30/12</div> 
         </td> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_19">ZZ324sdf</div> 
         </td> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_18">51446/75</div> 
         </td> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_20">HOLD_DE</div> 
         </td> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_20">234234</div> 
         </td> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_20">smith, john</div> 
         </td> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_20">PPPM-6P - SOME AGENCY</div> 
         </td> 
        </tr> 
        <tr> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_18">1/31/12</div> 
         </td> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_19">SFD3434</div> 
         </td> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_18">51668/17</div> 
         </td> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_20">HOLD_DE</div> 
         </td> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_20">FOY, EL</div> 
         </td> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_20">FOY, ALEX</div> 
         </td> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_20">someagency &amp; Associates LLC</div> 
         </td> 
        </tr> 
        <tr> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_18">1/31/12</div> 
         </td> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_19">SFD3434</div> 
         </td> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_18">51668/25</div> 
         </td> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_20">HOLD_DE</div> 
         </td> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_20">JAMISON, PA</div> 
         </td> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_20">JAMISON, ROY</div> 
         </td> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_20">someagency &amp; Associates LLC</div> 
         </td> 
        </tr> 
        <tr> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_18">1/31/12</div> 
         </td> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_19">SFD3434</div> 
         </td> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_18">51669/34</div> 
         </td> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_20">HOLD_DE</div> 
         </td> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_20">NEWMAN, SO</div> 
         </td> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_20">NEWMAN, ALEX</div> 
         </td> 
         <td class="style_17" style="vertical-align: baseline;"> 
          <div class="style_20">someagency &amp; Associates LLC</div> 
         </td> 
        </tr> 
       </tbody> 
       <tfoot> 
        <tr> 
         <td colspan="2" style="vertical-align: baseline;"> 
          <div class="style_21">Total Tests:</div> 
         </td> 
         <td style="vertical-align: baseline;"> 
          <div class="style_22">4</div> 
         </td> 
         <td style="vertical-align: baseline;"></td> 
         <td style="vertical-align: baseline;"></td> 
         <td style="vertical-align: baseline;"></td> 
         <td style="vertical-align: baseline;"></td> 
        </tr> 
       </tfoot> 
      </table> 
      <table style=" width: 100%;" id="AUTOGENBOOKMARK_8507236727661888074"> 
       <col> 
       <col> 
       <col> 
       <tbody> 
        <tr> 
         <td style="vertical-align: baseline;"> 
          <div class="style_2"> 
           <br>Feb 13, 2012 9:37 AM</div> 
         </td> 
         <td style="vertical-align: baseline;"> 
          <div class="style_3"> 
           <br> 
           <div style="text-align:center;">Page 1</div> 
          </div> 
         </td> 
         <td style="vertical-align: baseline;"></td> 
        </tr> 
       </tbody> 
      </table> 
     </div> 
    </body> 

to extract this data (shown in a screenshot in the original post).

So far I have this:

foreach (var row in htmlSnippet.DocumentNode.SelectNodes("//table[@class = 'style_15']/tbody/tr")) 
       { 
        foreach (var cell in row.SelectNodes("div[@class='*']")) 
        { 
         textBox1.Text = cell.InnerHtml.ToString(); 
        } 
       } 

But I get nothing back!

This line works:

//table[@class = 'style_15']/tbody/tr 

But this does not return anything:

("div[@class='*']")) 

Please let me know what I am doing wrong! I need help returning every piece of data shown in the image (except the field names).

+1

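You seem to have skipped the td... row.SelectNodes("td/div[@class='*']") – 2012-02-13 19:21:38

+0

@amit_g Thanks for your help, but that didn't work – 2012-02-13 19:26:20

+0

@patrick I'm sorry, these didn't work. It goes through the outer loop with //table... but never enters the inner loop – 2012-02-13 19:32:37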

Answers

2

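You probably just want div[@class], that is, div elements that have a class attribute.

Also worth noting: the HTML/XML sample you provided is not well-formed. I had to remove all the col elements and close the br elements. That may be a problem for C# as well. I know it is for XSL in general, but I'm not sure about XPath.

I didn't have time to write a C# example, but here is a simple XSL: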

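<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> 
    <!-- Drop whitespace-only text nodes so that only the copied divs are emitted --> 
    <xsl:strip-space elements="*"/> 
    <xsl:output method="xml" indent="yes"/> 
    <xsl:template match="/"> 
        <so> 
        <xsl:apply-templates select="//table[@class = 'style_15']/tbody/tr"/> 
        </so> 
    </xsl:template> 
    <!-- Copy every div that has a class attribute, whatever its value --> 
    <xsl:template match="div[@class]"> 
        <xsl:copy-of select="."/> 
    </xsl:template> 
</xsl:stylesheet> 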

I get this output:

<so> 
    <div class="style_18">1/30/12</div> 
    <div class="style_19">ZZ324sdf</div> 
    <div class="style_18">51446/75</div> 
    <div class="style_20">HOLD_DE</div> 
    <div class="style_20">234234</div> 
    <div class="style_20">smith, john</div> 
    <div class="style_20">PPPM-6P - SOME AGENCY</div> 
    <div class="style_18">1/31/12</div> 
    <div class="style_19">SFD3434</div> 
    <div class="style_18">51668/17</div> 
    <div class="style_20">HOLD_DE</div> 
    <div class="style_20">FOY, EL</div> 
    <div class="style_20">FOY, ALEX</div> 
    <div class="style_20">someagency &amp; Associates LLC</div> 
    <div class="style_18">1/31/12</div> 
    <div class="style_19">SFD3434</div> 
    <div class="style_18">51668/25</div> 
    <div class="style_20">HOLD_DE</div> 
    <div class="style_20">JAMISON, PA</div> 
    <div class="style_20">JAMISON, ROY</div> 
    <div class="style_20">someagency &amp; Associates LLC</div> 
    <div class="style_18">1/31/12</div> 
    <div class="style_19">SFD3434</div> 
    <div class="style_18">51669/34</div> 
    <div class="style_20">HOLD_DE</div> 
    <div class="style_20">NEWMAN, SO</div> 
    <div class="style_20">NEWMAN, ALEX</div> 
    <div class="style_20">someagency &amp; Associates LLC</div> 
</so> 

This is just intermediate output to show that the XPath works.

Hope this helps.
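
On the well-formedness point above, here is a minimal sketch of what a strict XML parser does with markup like the posted sample. It uses System.Xml.Linq from the .NET base library, not the parser from the question, and the helper name ParsesAsXml and the two shortened fragments are made up for illustration. It assumes a C# 9 or later console project, since it relies on top-level statements. An HTML parser such as the one the question uses is designed to tolerate unclosed tags, so this shows only why the sample had to be repaired before an XSL processor would accept it.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper: true when the markup parses as XML,
// false when the strict parser rejects it.
static bool ParsesAsXml(string markup)
{
    try
    {
        XDocument.Parse(markup);
        return true;
    }
    catch (XmlException)
    {
        return false;
    }
}

// Shortened fragment with unclosed <col> and <br>, as in the posted HTML.
string asPosted =
    "<table><col><tbody><tr><td><div class=\"style_2\">" +
    "<br>Feb 13, 2012 9:37 AM</div></td></tr></tbody></table>";

// Same fragment with <col> removed and <br> closed.
string repaired =
    "<table><tbody><tr><td><div class=\"style_2\">" +
    "<br/>Feb 13, 2012 9:37 AM</div></td></tr></tbody></table>";

Console.WriteLine(ParsesAsXml(asPosted));  // False
Console.WriteLine(ParsesAsXml(repaired));  // True
```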

+0

That's it, thank you so much!!!!!!! – 2012-02-13 19:37:24

3

* is normally used to match any element or attribute name, not a value. To match all div elements that have a class attribute with any value, use @class:

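// Select the td cells directly; each data div is a child of a td, not of the tr.
foreach (var cell in htmlSnippet.DocumentNode.SelectNodes("//table[@class = 'style_15']/tbody/tr/td")) 
{ 
    // div[@class] matches any div that has a class attribute, whatever its value.
    foreach (var div in cell.SelectNodes("div[@class]")) 
    { 
        // Append rather than assign, so earlier values are not overwritten.
        textBox1.Text += div.InnerHtml + Environment.NewLine; 
    } 
}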
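
To see why the original expressions selected nothing, the sketch below evaluates them against one well-formed row modelled on the question's table. It uses System.Xml.Linq and System.Xml.XPath from the .NET base library in place of the HTML parser from the question, so it is only an illustration. The predicate rules come from XPath itself and should carry over, but that is an assumption to check against your own document. CountMatches is a made-up helper, and the code assumes a C# 9 or later console project (top-level statements).

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

// Hypothetical helper: number of elements the XPath expression selects,
// evaluated relative to the given context element.
static int CountMatches(XElement context, string xpath)
{
    return context.XPathSelectElements(xpath).Count();
}

// One row modelled on the question's data table, written as well-formed XML.
XElement row = XElement.Parse(
    "<tr>" +
    "<td class=\"style_17\"><div class=\"style_18\">1/30/12</div></td>" +
    "<td class=\"style_17\"><div class=\"style_19\">ZZ324sdf</div></td>" +
    "</tr>");

// No div is a direct child of tr, and '*' is compared as a literal string.
Console.WriteLine(CountMatches(row, "div[@class='*']"));     // 0

// The step through td is right, but no class attribute equals "*".
Console.WriteLine(CountMatches(row, "td/div[@class='*']"));  // 0

// [@class] tests only that the attribute exists.
Console.WriteLine(CountMatches(row, "td/div[@class]"));      // 2
```

So two things were wrong at once: the missing td step, and the '*' value test. Fixing only one of them still returns no nodes, which would explain why the td/div[@class='*'] suggestion in the comments did not work either.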