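Replacing HTML using DOMDocument in PHP
I'm trying to clean up some bad HTML using DOMDocument. The HTML has a <div class="article"> element that uses <br/><br/> instead of </p><p>. I want to regex these into proper paragraphs, but I can't seem to get my node back into the original document: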
//load entire doc
$doc = new DOMDocument();
$doc->loadHTML($htm);
$xpath = new DOMXpath($doc);
//get the article
$article = $xpath->query("//div[@class='article']")->parentNode;
//get as string
$article_htm = $doc->saveXML($article);
//regex the bad markup
$article_htm2 = preg_replace('/<br\/><br\/>/i', '</p><p>', $article_htm);
//create new doc w/ new html string
$doc2 = new DOMDocument();
$doc2->loadHTML($article_htm2);
$xpath2 = new DOMXpath($doc2);
//get the original article node
$article_old = $xpath->query("//div[@class='article']");
//get the new article node
$article_new = $xpath2->query("//div[@class='article']");
//replace original node with new node
$article->replaceChild($article_old, $article_new);
$article_htm_new = $doc->saveXML();
//dump string
var_dump($article_htm_new);
All I get is a 500 Internal Server Error... not sure what I'm doing wrong.
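
For reference, here is a minimal sketch of what the snippet above seems to be aiming for. It rests on three facts about the DOM extension: `DOMXPath::query()` returns a `DOMNodeList` (so a single node has to be picked with `item(0)`), `replaceChild()` takes the new node first and the old node second and is called on the old node's parent, and a node created in another document has to be brought over with `importNode()` before it can be attached. The sample `$htm` string, the slightly more tolerant regex, and the `RuntimeException` are assumptions added for illustration, not part of the original code.

```php
<?php
// Hypothetical sample input; $htm stands in for the real page source.
$htm = '<html><body><div id="wrap"><div class="article">'
     . '<p>First<br/><br/>Second</p></div></div></body></html>';

// Collect parser warnings about sloppy markup instead of emitting them.
libxml_use_internal_errors(true);

// Load the entire document.
$doc = new DOMDocument();
$doc->loadHTML($htm);
$xpath = new DOMXPath($doc);

// query() returns a DOMNodeList, so take the first match explicitly.
$article_old = $xpath->query("//div[@class='article']")->item(0);
if ($article_old === null) {
    throw new RuntimeException('No div.article found');
}

// Serialize only the article node to a string.
$article_htm = $doc->saveXML($article_old);

// Turn double line breaks into paragraph boundaries.
// Assumption: the pattern also accepts <br> and <br /> spellings.
$article_htm2 = preg_replace('/<br\s*\/?>\s*<br\s*\/?>/i', '</p><p>', $article_htm);

// Parse the repaired fragment in a second document.
$doc2 = new DOMDocument();
$doc2->loadHTML($article_htm2);
$xpath2 = new DOMXPath($doc2);
$article_new = $xpath2->query("//div[@class='article']")->item(0);

// A node belongs to the document that created it, so deep-copy it
// into the original document before attaching it.
$imported = $doc->importNode($article_new, true);

// replaceChild(new, old) is called on the parent of the old node.
$article_old->parentNode->replaceChild($imported, $article_old);

// Serialize the whole original document, now with the repaired article.
$result = $doc->saveHTML();
var_dump($result);
```

If the page contains non-ASCII text, the second `loadHTML()` call may need an explicit encoding hint, since the fragment no longer carries the original document's `<meta>` charset declaration.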