PHP DOMDocument - match and remove URLs

I'm trying to extract links from an HTML page using DOM:

$html = file_get_contents('links.html'); 
$DOM = new DOMDocument(); 
$DOM->loadHTML($html); 
$a = $DOM->getElementsByTagName('a'); 
foreach($a as $link){ 
    //echo out the href attribute of the <A> tag. 
    echo $link->getAttribute('href').'<br/>'; 
}

Output:

http://dontwantthisdomain.com/dont-want-this-domain-name/ 
http://dontwantthisdomain2.com/also-dont-want-any-pages-from-this-domain/ 
http://dontwantthisdomain3.com/dont-want-any-pages-from-this-domain/ 
http://domain1.com/page-X-on-domain-com.html 

http://dontwantthisdomain.com/dont-want-link-from-this-domain-name.html 
http://dontwantthisdomain2.com/dont-want-any-pages-from-this-domain/ 
http://domain.com/page-XZ-on-domain-com.html 

http://dontwantthisdomain.com/another-page-from-same-domain-that-i-dont-want-to-be-included/ 
http://dontwantthisdomain2.com/same-as-above/ 
http://domain3.com/page-XYZ-on-domain3-com.html

I want to remove every result that matches dontwantthisdomain.com, dontwantthisdomain2.com or dontwantthisdomain3.com, so that the output looks like this:

http://domain1.com/page-X-on-domain-com.html 
http://domain.com/page-XZ-on-domain-com.html 
http://domain3.com/page-XYZ-on-domain3-com.html

Any ideas? :)

2013-09-25 Kris

$x = new DOMXPath($DOM); $x->query('//a/@href/[not(contains(text(), "dontwantthisdomain"))]'); :P – kojiro

@yann-milin could you take a look and let me know what you think? Thanks, pal – Kris

@kojiro: it seems your code causes an error. Could you take another look? Thanks :) – Kris
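The error is most likely caused by the `/[` step in kojiro's expression: in XPath a predicate has to follow a node test directly, and an attribute has no text() children to test. Below is a minimal sketch of a corrected query, not necessarily what kojiro intended. The inline sample markup (standing in for links.html) and the `$kept` array are assumptions made for illustration.

```php
<?php
// Hypothetical sample markup standing in for links.html.
$html = '<html><body>'
      . '<a href="http://dontwantthisdomain.com/dont-want-this-domain-name/">a</a>'
      . '<a href="http://domain1.com/page-X-on-domain-com.html">b</a>'
      . '<a href="http://dontwantthisdomain2.com/same-as-above/">c</a>'
      . '<a href="http://domain3.com/page-XYZ-on-domain3-com.html">d</a>'
      . '</body></html>';

$DOM = new DOMDocument();
$DOM->loadHTML($html);

$xpath = new DOMXPath($DOM);

// The predicate is attached directly to @href; "." is the attribute's value.
$hrefs = $xpath->query('//a/@href[not(contains(., "dontwantthisdomain"))]');

// Collect the surviving URLs (the $kept name is illustrative).
$kept = [];
foreach ($hrefs as $href) {
    $kept[] = $href->nodeValue;
}

echo implode("<br/>\n", $kept), "\n";
```

Note that contains() is a plain substring test, so it would also drop a wanted URL whose path happens to contain "dontwantthisdomain".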

I think you should use regular expressions. Google it and have fun.
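One way the regular-expression suggestion could look is sketched below: keep the DOM loop from the question and skip any href whose host is on a blocklist. The `$blocked` list, the `$pattern` variable, the `$kept` array and the inline sample markup are all assumptions made for illustration, not part of the original answer.

```php
<?php
// Hypothetical sample markup standing in for links.html.
$html = '<html><body>'
      . '<a href="http://dontwantthisdomain.com/dont-want-this-domain-name/">a</a>'
      . '<a href="http://dontwantthisdomain2.com/dont-want-any-pages-from-this-domain/">b</a>'
      . '<a href="http://dontwantthisdomain3.com/dont-want-any-pages-from-this-domain/">c</a>'
      . '<a href="http://domain1.com/page-X-on-domain-com.html">d</a>'
      . '<a href="http://domain.com/page-XZ-on-domain-com.html">e</a>'
      . '<a href="http://domain3.com/page-XYZ-on-domain3-com.html">f</a>'
      . '</body></html>';

// Domains to drop (illustrative list taken from the question).
$blocked = [
    'dontwantthisdomain.com',
    'dontwantthisdomain2.com',
    'dontwantthisdomain3.com',
];

// Escape each domain so "." is matched literally, then join as alternatives.
$alternatives = implode('|', array_map(function ($domain) {
    return preg_quote($domain, '~');
}, $blocked));

// Match http(s) URLs whose host (optionally prefixed with www.) is blocked.
$pattern = '~^https?://(?:www\.)?(?:' . $alternatives . ')(?:[/:?#]|$)~i';

$DOM = new DOMDocument();
$DOM->loadHTML($html);

$kept = [];
foreach ($DOM->getElementsByTagName('a') as $link) {
    $href = $link->getAttribute('href');
    if (!preg_match($pattern, $href)) {
        $kept[] = $href; // host is not on the blocklist
    }
}

echo implode("<br/>\n", $kept), "\n";
```

Anchoring the pattern at the scheme and host means that domain.com is kept even though its name is a substring of the blocked domains.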

2013-09-25 02:40:38 user2813412

With $html = preg_replace('# – Kris
