2015-07-19

How can I get the exact structure of an HTML document in PHP using DomDocument or XPath?

For example, I have this HTML document:

<!DOCTYPE html> 
<html> 
<head> 
    <title>Webpage</title> 
</head> 
<body> 
<div class="content"> 
    <div> 
     <p>Paragraph</p> 
    </div> 
    <div> 
     <a href="someurl">This is an anchor</a> 
    </div> 
    <p>This is a paragraph inside a div</p> 
</div> 
</body> 
</html> 

I want to grab the exact structure of the div whose class is content.

In PHP, if I use DomDocument's getElementsByTagName() method to get the div, I get this:

DOMElement Object 
    (
    [tagName] => div 
    [schemaTypeInfo] => 
    [nodeName] => div 
    [nodeValue] => 

     Paragraph 


     This is an anchor 

    This is a paragraph inside a div 

    [nodeType] => 1 
    [parentNode] => (object value omitted) 
    [childNodes] => (object value omitted) 
    [firstChild] => (object value omitted) 
    [lastChild] => (object value omitted) 
    [previousSibling] => (object value omitted) 
    [nextSibling] => (object value omitted) 
    [attributes] => (object value omitted) 
    [ownerDocument] => (object value omitted) 
    [namespaceURI] => 
    [prefix] => 
    [localName] => div 
    [baseURI] => 
    [textContent] => 

     Paragraph 


     This is an anchor 

    This is a paragraph inside a div 

) 

How can I get this instead:

<div class="content"> 
    <div> 
     <p>Paragraph</p> 
    </div> 
    <div> 
     <a href="someurl">This is an anchor</a> 
    </div> 
    <p>This is a paragraph inside a div</p> 
</div> 

Is there any way to do this?

Answers

Assuming $str contains the HTML:

// Create the DOMDocument and parse the HTML
$doc = new DOMDocument();
$doc->loadHTML($str);
// Find the div whose class attribute is exactly "content"
$xpath = new DOMXPath($doc);
$elements = $xpath->query('//div[@class = "content"]');
// Stop unless exactly one div matches
if ($elements->length != 1) die("expected exactly one div with class 'content'");
// Take the first (and only) match
$div = $elements->item(0);
// Print the div together with its markup
echo $doc->saveHTML($div);

Result

<div class="content"> 
    <div> 
     <p>Paragraph</p> 
    </div> 
    <div> 
     <a href="someurl">This is an anchor</a> 
    </div> 
    <p>This is a paragraph inside a div</p> 
</div> 
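
The query above matches only when the class attribute is exactly "content". If the div may carry several classes, or if only the markup inside the div is wanted, something like the following might help. This is a minimal sketch, not part of the original answer: the sample `$str`, the extra `main` class, and the `$outer`/`$inner` variable names are assumptions made for illustration.

```php
<?php
// Hypothetical input; the class attribute carries two tokens on purpose.
$str = '<!DOCTYPE html><html><body>'
     . '<div class="content main"><div><p>Paragraph</p></div>'
     . '<p>This is a paragraph inside a div</p></div>'
     . '</body></html>';

$doc = new DOMDocument();
// Collect libxml parse warnings internally instead of emitting them.
libxml_use_internal_errors(true);
$doc->loadHTML($str);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Match "content" as a whole token within a space-separated class list.
$query = '//div[contains(concat(" ", normalize-space(@class), " "), " content ")]';
$div = $xpath->query($query)->item(0);

$outer = '';
$inner = '';
if ($div !== null) {
    // Outer HTML: the element itself plus its descendants.
    $outer = $doc->saveHTML($div);
    // Inner HTML: serialize each child node and concatenate the pieces.
    foreach ($div->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
}

echo $outer, "\n";
echo $inner, "\n";
```

Passing a node to `saveHTML()` serializes just that node rather than the whole document, so the same call covers both the outer and the inner case. The whitespace of the serialized output may differ slightly from the input markup.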