2014-05-17 21 views

How do I get specific values from a DOMXPath query?

I am new to DOMXPath, but I want to learn more. Currently, I have HTML structured like this:

<span class="1"> 
     <div class="headerClass"> 
      Here you have <span class="spanClass1">some text</span>. And here there is <span class="spanClass2">even more text</span> 
     </div> 
     <table class="tableClass" id="tableID"> 
      <tr> 
       <td>some text</td> 
       <td>some text</td> 
       <td>some text</td> 
      </tr> 
      <tr> 
       <td>some text</td> 
       <td>some text</td> 
       <td><a href="http://www.website1.com" target="_blank">My Link</a></td> 
      </tr> 
      <tr> 
       <td>some text</td> 
       <td>some text</td> 
       <td><a href="http://www.website2.com" target="_blank">My Link</a></td> 
      </tr> 
     </table> 
    </span> 

    <span class="2"> 
     <div class="headerClass"> 
      Here you have <span class="spanClass1">some text</span>. And here there is <span class="spanClass2">even more text</span> 
     </div> 
     <table class="tableClass" id="tableID"> 
      <tr> 
       <td>some text</td> 
       <td>some text</td> 
       <td>some text</td> 
      </tr> 
      <tr> 
       <td>some text</td> 
       <td>some text</td> 
       <td><a href="http://www.website1.com" target="_blank">My Link</a></td> 
      </tr> 
      <tr> 
       <td>some text</td> 
       <td>some text</td> 
       <td><a href="http://www.website2.com" target="_blank">My Link</a></td> 
      </tr> 
     </table> 
    </span> 

... and the spans continue: 3, 4, 5 ... etc 

To retrieve the HTML from the source document, I use this:

$oDomXpath = new DOMXpath($oDom); 
$query = "//span[number(@class)=number(@class)]"; 
$oDomObject = $oDomXpath->query($query); 

foreach ($oDomObject as $oObject) { 
    // WHAT GOES HERE???? 
} 

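I need an array that stores the following values:

  1. The plain text of every <div class="headerClass">, without HTML tags.
  2. The full text of every <span class="spanClass2">.
  3. All URLs inside the table. The table can have any number of rows, from 0 to many.

How can I do this? What do I need to put inside the foreach loop? Do I need to run another query?

Thank you very much for your help!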

Answer


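You have a choice: you can run several XPath queries and fetch the values one at a time, or you can build a single XPath query that combines several paths: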

<?php 
$dom = new DOMDocument(); 
@$dom->loadHTMLFile('yourfile.html'); 

$xpath = new DOMXPath($dom); 

$xquery = <<<'EOD' 
//span[number(@class)=@class]/@class | 
//span[number(@class)=@class]/div[@class='headerClass'] | 
//span[number(@class)=@class]/div[@class='headerClass']/span[@class='spanClass2'] | 
//span[number(@class)=@class]/table[@class='tableClass']/tr/td/a 
EOD; 

$nodes = $xpath->query($xquery); 

foreach ($nodes as $node) { 
    if ($node->nodeType == XML_ELEMENT_NODE) { 
        // Element nodes: the header div, the spanClass2 span, or a link.
        switch ($node->nodeName) { 
            case 'div':  echo '<br/>div content: ' . $node->nodeValue; break; 
            case 'span': echo '<br/>span content: ' . $node->nodeValue; break; 
            default:     echo '<br/>url: ' . $node->getAttribute('href'); 
        } 
    } else { 
        // Attribute node: the numeric class of the enclosing span.
        echo '<br/><br/>number: ' . $node->value; 
    } 
} 
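
The other option, several XPath queries, maps more directly onto the array the question asks for. What follows is only a minimal sketch under some assumptions: the inline `$html` sample is a shortened, hypothetical stand-in for the real page, the array keys `header`, `span2` and `urls` are made-up names, and the outer query reuses the predicate from the question. It relies on the optional second argument of `DOMXPath::query()`, which evaluates a relative path against a given context node.

```php
<?php
// Hypothetical sample markup, shortened from the structure in the question.
$html = <<<'HTML'
<span class="1">
  <div class="headerClass">
    Here you have <span class="spanClass1">some text</span>. And here there is <span class="spanClass2">even more text</span>
  </div>
  <table class="tableClass">
    <tr><td>some text</td><td><a href="http://www.website1.com">My Link</a></td></tr>
    <tr><td>some text</td><td><a href="http://www.website2.com">My Link</a></td></tr>
  </table>
</span>
<span class="2">
  <div class="headerClass">Header <span class="spanClass2">two</span></div>
  <table class="tableClass"></table>
</span>
HTML;

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$xpath  = new DOMXPath($dom);
$result = [];

// Outer query: every span whose class attribute is a number.
foreach ($xpath->query('//span[number(@class)=number(@class)]') as $span) {
    // Passing $span as the context node makes each inner query relative to it.
    $header = $xpath->query("./div[@class='headerClass']", $span)->item(0);
    $span2  = $xpath->query("./div[@class='headerClass']/span[@class='spanClass2']", $span)->item(0);

    // Collect every link target inside this span's table (possibly none).
    $urls = [];
    foreach ($xpath->query("./table[@class='tableClass']//a/@href", $span) as $href) {
        $urls[] = $href->value;
    }

    $result[$span->getAttribute('class')] = [
        // textContent drops the tags; collapse whitespace left by the indentation.
        'header' => $header ? trim(preg_replace('/\s+/', ' ', $header->textContent)) : null,
        'span2'  => $span2 ? trim($span2->textContent) : null,
        'urls'   => $urls,
    ];
}

print_r($result);
```

With this layout each numbered span gets one entry holding its own header text, `spanClass2` text and list of URLs, so a table with zero rows simply yields an empty `urls` array. The cost compared with the single combined query is a few extra queries per span.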

Thank you very much! It really led me to the solution! Cheers! – karlosuccess