How to use cURL with preg_match_all to get DIV content

I am trying to practice cURL, but it is not going well. Please tell me what is wrong. Here is my code:

<?php 
$ch = curl_init(); 
curl_setopt($ch, CURLOPT_URL, "http://xxxxxxx.com/"); 
curl_setopt($ch, CURLOPT_HEADER, 0); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE); 
curl_setopt($ch, CURLOPT_USERAGENT, "Google Bot"); 
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); 

$downloaded_page = curl_exec($ch); 
curl_close($ch); 
preg_match_all('/<div\s* class =\"abc\">(.*)<\/div>/', $downloaded_page, $title); 
echo "<pre>"; 
print($title[1]); 
echo "</pre>";

It produces this warning: Notice: Array to string conversion

The HTML I want to parse looks like this:

<div class="abc"> 
<ul> blablabla </ul> 
<ul> blablabla </ul> 
<ul> blablabla </ul> 
</div>

2013-10-13 user2492364

$title is not a flat array but an array of arrays. See the examples on the manual page: http://php.net/manual/en/function.preg-match-all.php – Ashalynd

preg_match_all fills its third argument with an array of arrays (its return value is only the number of matches).
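To make that structure concrete, here is a minimal sketch. The input string is a hypothetical sample, not the page from the question; only the class name "abc" is taken from it.

```php
<?php
// Hypothetical sample input; the class name "abc" comes from the question.
$html = '<div class="abc">first</div><div class="abc">second</div>';

// The return value is the number of full matches; the matches themselves
// are written into $matches (default flag: PREG_PATTERN_ORDER).
$count = preg_match_all('/<div class="abc">(.*?)<\/div>/', $html, $matches);

// $matches[0] holds the full matches, $matches[1] the first capture group.
// Both are arrays, which is why print($matches[1]) raises
// "Array to string conversion".
print_r($matches[1]);
```

Here $matches[1] is itself an array with one entry per match, so it has to be looped over or dumped with print_r rather than printed directly.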

If your code is:

preg_match_all('/<div\s+class="abc">(.*)<\/div>/', $downloaded_page, $title);

then you actually need to do the following:

echo "<pre>"; 
foreach ($title[1] as $realtitle) { 
    echo $realtitle . "\n"; 
} 
echo "</pre>";

because it searches for every div that has the class "abc". I would also suggest tightening your regular expression to make it more robust:

preg_match_all('/<div[^>]+class="abc"[^>]*>(.*)<\/div>/', $downloaded_page, $title);

This will also match div tags that carry other attributes before or after the class attribute.
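One caveat that may matter here: the HTML in the question spans several lines, and in PCRE the dot does not match a newline unless the s modifier is set. The sketch below runs the pattern on the question's sample HTML with and without that modifier. The lazy quantifier (.*?) is an addition of this example, not part of the answer above.

```php
<?php
// Sample HTML copied from the question.
$html = <<<HTML
<div class="abc">
<ul> blablabla </ul>
<ul> blablabla </ul>
<ul> blablabla </ul>
</div>
HTML;

// Without the "s" modifier, "." stops at newlines, so nothing matches.
$without = preg_match_all('/<div[^>]+class="abc"[^>]*>(.*)<\/div>/', $html, $m1);

// With "s", "." also matches newlines; ".*?" stops at the first closing </div>.
$with = preg_match_all('/<div[^>]+class="abc"[^>]*>(.*?)<\/div>/s', $html, $m2);

echo "matches without s: ", $without, "\n";
echo "matches with s: ", $with, "\n";
echo trim($m2[1][0]), "\n";
```

Because the lazy quantifier stops at the first closing tag, a div nested inside the target div would cut the capture short. That is one of the cases where a real HTML parser is the safer choice.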

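BTW: DOMDocument is slow as hell. I have found that a regular expression can sometimes (depending on the size of your document) be 40 times faster. Just keep it simple.

Best, Nicolas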

2013-10-13 21:30:36 Nicolas

Don't parse HTML with regex.

$ch = curl_init(); 
curl_setopt($ch, CURLOPT_URL, 'http://www.lipsum.com/'); 
curl_setopt($ch, CURLOPT_HEADER, false); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); 
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); 
$html = curl_exec($ch); 
curl_close($ch); 

$dom = new DOMDocument; 
@$dom->loadHTML($html); 
$xpath = new DOMXPath($dom); 
# foreach ($xpath->query('//div') as $div) { // all div's in html 
foreach ($xpath->query('//div[contains(@class, "abc")]') as $div) { // all div's that have "abc" classname 
    // $div->nodeValue contains fetched DIV content 
}
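As a follow-up sketch of how the loop body might be filled in: the example below uses a hypothetical inline HTML string modeled on the question's markup instead of a live cURL response, and the second div with class "abcd" is invented purely to show a pitfall. Note that contains(@class, "abc") also selects class="abcd"; the longer XPath expression here matches "abc" only as a whole class token. libxml_use_internal_errors is used in place of the @ operator, which is a stylistic choice rather than a requirement.

```php
<?php
// Hypothetical input modeled on the question's HTML; in practice this
// would be the string returned by curl_exec().
$html = '<div class="abc"><ul> one </ul><ul> two </ul></div>'
      . '<div class="abcd"><ul> other </ul></div>';

// Collect libxml parse warnings internally instead of silencing them with "@".
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Match "abc" as a whole class token, so class="abcd" is not selected.
$query = '//div[contains(concat(" ", normalize-space(@class), " "), " abc ")]';

$items = [];
foreach ($xpath->query($query) as $div) {
    // Collect the trimmed text of each <ul> inside the matching div.
    foreach ($xpath->query('.//ul', $div) as $ul) {
        $items[] = trim($ul->textContent);
    }
}
print_r($items);
```

With this input, $items ends up holding the text of the two lists inside the "abc" div, and the "abcd" div is skipped.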

2013-10-13 10:04:18
