PHP Simple HTML DOM: scraping an external URL

I want to build a personal project, but I got a bit stuck when using the Simple HTML DOM class.

What I want to do is scrape a website and retrieve the HTML of every element that matches a certain class.

My code so far is:

<?php
error_reporting(E_ALL);
include_once("simple_html_dom.php");

// fetch and parse the HTML content
$url = 'http://www.peopleperhour.com/freelance-seo-jobs';
$html = file_get_html($url);

// get all data inside <div class="item-list">
foreach ($html->find('div[class=item-list]') as $div) {
    // get all divs inside "item-list"
    foreach ($div->find('div') as $d) {
        // get the element's HTML (outertext includes the tag itself)
        $data = $d->outertext;
    }
}
print_r($data);
echo "END";
?>

All I get from this is "END". There is no other output, just a blank page.

Source

2013-12-09 MikeF

Possible duplicate of [Scrape web page contents](http://stackoverflow.com/questions/584826/scrape-web-page-contents)

I think you may want something like this:

$url = 'http://www.peopleperhour.com/freelance-seo-jobs'; 
$html = file_get_html($url); 
foreach ($html->find('div.item-list div.item') as $div) { 
    echo $div . '<br />'; 
}

This gives you output like the following (if you add the appropriate stylesheet, it displays nicely):

[Image: screenshot of the resulting output]

Source

2013-12-09 16:15:53

Perfect! Works as expected. How does it work when the class is defined as 'div[class=item-list]'? – MikeF

You may need to quote item-list. – pguardiario
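
On the selector question: in standard CSS, `div[class=item-list]` matches only when the whole class attribute equals `item-list`, while `div.item-list` matches any element whose class list contains that token. Whether Simple HTML DOM follows these semantics exactly is not confirmed in this thread, so the sketch below swaps in PHP's built-in DOMDocument and DOMXPath to illustrate the difference. The sample markup, the `clearfix` class, and the `hasClass` helper are made up for illustration.

```php
<?php
// Hypothetical markup: the container carries a second class ("clearfix").
$markup = '<div class="item-list clearfix">'
        . '<div class="item">Job A</div>'
        . '<div class="item">Job B</div>'
        . '</div>';

// Hypothetical helper: builds an XPath predicate that tests for one class token.
function hasClass(string $name): string
{
    return 'contains(concat(" ", normalize-space(@class), " "), " ' . $name . ' ")';
}

$doc = new DOMDocument();
$doc->loadHTML($markup);
$xpath = new DOMXPath($doc);

// Exact attribute match, comparable to CSS div[class=item-list].
$exact = $xpath->query('//div[@class="item-list"]');

// Class-token match, comparable to CSS div.item-list.
$token = $xpath->query('//div[' . hasClass('item-list') . ']');

// Descendant match, comparable to CSS div.item-list div.item.
$items = $xpath->query(
    '//div[' . hasClass('item-list') . ']//div[' . hasClass('item') . ']'
);

$exactCount = $exact->length;
$tokenCount = $token->length;

// Collect the text of each matched item.
$itemTexts = [];
foreach ($items as $item) {
    $itemTexts[] = $item->textContent;
}

echo "exact matches: $exactCount, token matches: $tokenCount\n";
echo implode(', ', $itemTexts), "\n";
```

With this markup the exact-attribute query finds nothing because of the extra class, while the token-based queries find the container and both items. If the real page uses more than one class on the container, that may be one reason the two selector styles behave differently.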

It looks like your $data variable is assigned a different value on each iteration, so only the last one survives. Try this:

$data = "";
foreach ($html->find('div[class=item-list]') as $div) {
    // get all divs inside "item-list"
    foreach ($div->find('div') as $d) {
        // append each div's HTML instead of overwriting it
        $data .= $d->outertext;
    }
}
print_r($data);

I hope that helps.
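
The difference between overwriting and accumulating can be sketched without any network access. This is a minimal illustration rather than the asker's code: it uses PHP's built-in DOMDocument instead of Simple HTML DOM, and the markup is invented. Assigning with `=` inside the loop keeps only the last element, while pushing into an array keeps them all.

```php
<?php
// Hypothetical markup standing in for the scraped page.
$markup = '<div class="item-list">'
        . '<div>first</div><div>second</div><div>third</div>'
        . '</div>';

$doc = new DOMDocument();
$doc->loadHTML($markup);

// The first <div> in the document is the container.
$container = $doc->getElementsByTagName('div')->item(0);

$last = '';  // overwritten on every pass
$all  = [];  // grows on every pass

// getElementsByTagName returns every descendant <div> of the container.
foreach ($container->getElementsByTagName('div') as $d) {
    $html = $doc->saveHTML($d);  // serialize the element including its own tag
    $last = $html;               // plain assignment keeps only the latest value
    $all[] = $html;              // appending keeps every value
}

echo "last only:\n", $last, "\n\n";
echo "accumulated:\n", implode("\n", $all), "\n";
```

One thing to watch with either library: a descendant search for `div` also returns nested divs, so a div inside another matched div will show up twice in the accumulated output, once within its parent and once on its own.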

Source

2013-12-09 16:14:04

Yes, you are right. I will edit the answer. –

Unfortunately, this still doesn't work. However, defining the class as in Sheikh's answer works perfectly. – MikeF
