How to get page content

I'm trying to build a "latest news" style feature for my website. So far I have done the following:

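$dom = new DOMDocument();
// Set before loading so it can take effect; the property is preserveWhiteSpace (no trailing "s").
$dom->preserveWhiteSpace = false;
// @ suppresses the warnings libxml emits for malformed real-world HTML.
@$dom->loadHTML(file_get_contents($url));
$linksToStore = $dom->getElementsByTagName('a');

$links = [];
foreach ($linksToStore as $tag) {
    // textContent is safe even when the <a> element has no child nodes.
    $links[$tag->getAttribute('href')] = trim($tag->textContent);
}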

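How can I get the content of the pages those links point to, restricted to a specific domain? In other words, I want a web crawler that can collect content from the linked pages on that domain, which in my case is 'medical'.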
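One way to approach this, as a minimal sketch: keep only the links whose host matches the domain of interest, then load each of those pages and pull out its text. The helper names `linksForDomain` and `pageText` and the domain `example.com` are assumptions made for illustration, not part of the original code, and relative links are skipped for brevity.

```php
<?php
// Hypothetical helper: return href => link text for links on $domain or its subdomains.
function linksForDomain(string $html, string $domain): array
{
    $dom = new DOMDocument();
    @$dom->loadHTML($html); // @ suppresses warnings from malformed markup
    $links = [];
    $suffix = '.' . $domain;
    foreach ($dom->getElementsByTagName('a') as $tag) {
        $href = $tag->getAttribute('href');
        $host = parse_url($href, PHP_URL_HOST);
        // Relative or unparsable links have no host and are skipped in this sketch.
        if (!is_string($host)) {
            continue;
        }
        $host = strtolower($host);
        if ($host === $domain || substr($host, -strlen($suffix)) === $suffix) {
            $links[$href] = trim($tag->textContent);
        }
    }
    return $links;
}

// Hypothetical helper: return the text of a page's <body>, without scripts and styles.
function pageText(string $html): string
{
    $dom = new DOMDocument();
    @$dom->loadHTML($html);
    foreach (['script', 'style'] as $name) {
        $nodes = $dom->getElementsByTagName($name);
        // The node list is live, so keep removing the first item until none remain.
        while ($nodes->length > 0) {
            $node = $nodes->item(0);
            $node->parentNode->removeChild($node);
        }
    }
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }
    // Collapse runs of whitespace into single spaces.
    return trim(preg_replace('/\s+/', ' ', $body->textContent));
}
```

In use, this would look something like `foreach (linksForDomain(file_get_contents($url), 'example.com') as $href => $label) { $content = pageText(file_get_contents($href)); }`, after checking that `file_get_contents` did not return `false`. A real crawler would also need to resolve relative links, avoid revisiting pages, and limit its request rate.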

Source

2012-11-25 vaibhav

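Use this library, http://simplehtmldom.sourceforge.net/, to extract content from the page. Its selectors work much like jQuery's, which makes extracting content straightforward.

Also, check out http://davidwalsh.name/php-notifications to learn more.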
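To illustrate what selector-driven extraction looks like, here is a rough sketch that swaps in PHP's built-in `DOMXPath` in place of the library's jQuery-style selectors, so it runs without any extra download. The helper name `selectText`, the sample markup, and the `news` class are all hypothetical.

```php
<?php
// Hypothetical helper: return the trimmed text of every node matching an XPath query.
function selectText(string $html, string $query): array
{
    $dom = new DOMDocument();
    @$dom->loadHTML($html); // @ suppresses warnings from malformed markup
    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query($query);
    $texts = [];
    if ($nodes === false) {
        return $texts; // the XPath expression was invalid
    }
    foreach ($nodes as $node) {
        $texts[] = trim($node->textContent);
    }
    return $texts;
}

// Roughly what a jQuery-style selector "div.news h2" would match,
// assuming "news" is the only class on the element.
$html = '<div class="news"><h2>First headline</h2><h2>Second headline</h2></div><h2>Other</h2>';
$headlines = selectText($html, '//div[@class="news"]//h2');
```

XPath is more verbose than CSS-style selectors, which is the convenience the library offers.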

Source

2012-11-25 08:24:29
