2014-10-09 77 views
1

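I'm having trouble extracting the integers inside the brackets from this website. I'm scraping the site and have considered regular expressions and similar approaches.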

A section of the markup from the website:

<span class="b-label b-link-number" data-num="(322206)">Music &amp; Video</span> 
<span class="b-label b-link-number" data-num="(954218)">Toys, Hobbies &amp; Games</span> 
<span class="b-label b-link-number" data-num="(502981)">Kids, Baby &amp; Maternity</span> 

How do I extract the integers inside the brackets?

Desired output:

322206 
954218 
502981 

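Should I use a regular expression, since the elements all share the same class name? (Or perhaps not, since a regular expression run over the whole source would also pick up other unwanted elements that contain brackets.)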

Normally, this is how I would extract the information:

<?php 
//header('Content-Type: text/html; charset=utf-8'); 
$grep = new DOMDocument(); 
@$grep->loadHTMLFile("http://global.rakuten.com/en/search/?tl=&k="); // @ suppresses warnings from malformed HTML 
$finder = new DOMXPath($grep); 
$class = "b-list-item"; 
$nodes = $finder->query("//*[contains(@class, '$class')]"); 

foreach ($nodes as $node) { 
    $span = $node->childNodes; 
    // Strip the digits and parentheses so that only the category name remains 
    $search = array(0,1,2,3,4,5,6,7,8,9,'(',')'); 
    $categories = str_replace($search, '', $span->item(0)->nodeValue); 
    echo '<br>' . '<font color="green">' . $categories . ' ' . '</font>'; 
} 
?> 

But since the data I want is inside the tag itself (in the data-num attribute), how do I extract it?


Possible duplicate of [How do you parse and process HTML/XML in PHP?](http://stackoverflow.com/questions/3577641/how-do-you-parse-and-process-html-xml-in-php) – cdhowie 2014-10-09 04:21:24

Answer

2

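Building on your current code, this is fairly straightforward: change $class to the class you actually want, then use ->getAttribute() to read the data-num value of each matching element: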

$grep = new DOMDocument(); 
@$grep->loadHTMLFile("http://global.rakuten.com/en/search/?tl=&k="); 
$finder = new DOMXPath($grep); 
$class = "b-link-number"; // change to the class of the spans 
$nodes = $finder->query("//*[contains(@class, '$class')]"); // target those elements 

$numbers = array(); 
foreach ($nodes as $node) { // for every element found 
    $link_num = $node->getAttribute('data-num'); // get the `data-num` attribute 
    $link_num = str_replace(['(', ')'], '', $link_num); // simply remove the parentheses 
    $numbers[] = $link_num; // push it into the container 
} 

echo '<pre>'; 
print_r($numbers);
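
To try the same idea without hitting the live site, here is a minimal self-contained sketch that runs against the three sample spans from the question. It is an illustration, not the answer's own code: the names $html, $doc, $xpath, $query and $counts are made up for this example, the whole-token class match and the integer cast are optional refinements, and it assumes every data-num value has the exact form "(digits)".

```php
<?php
// Sample markup copied from the question; the live page may differ.
$html = <<<'HTML'
<span class="b-label b-link-number" data-num="(322206)">Music &amp; Video</span>
<span class="b-label b-link-number" data-num="(954218)">Toys, Hobbies &amp; Games</span>
<span class="b-label b-link-number" data-num="(502981)">Kids, Baby &amp; Maternity</span>
HTML;

// Collect parser warnings internally instead of silencing them with @.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Match "b-link-number" as a whole class token rather than as a substring,
// so a class such as "b-link-number-extra" would not match.
$query = "//span[contains(concat(' ', normalize-space(@class), ' '), ' b-link-number ')]";

$counts = array();
foreach ($xpath->query($query) as $span) {
    // Assumption: the attribute looks like "(322206)". Anything else is skipped.
    if (preg_match('/^\((\d+)\)$/', trim($span->getAttribute('data-num')), $m)) {
        $counts[trim($span->nodeValue)] = (int) $m[1];
    }
}

// Print one "category: count" line per span.
foreach ($counts as $category => $count) {
    echo $category, ': ', $count, PHP_EOL;
}
```

The regular expression here is applied only to the attribute value of elements already selected by class, so brackets elsewhere in the page source cannot produce false matches. Keying $counts by category name assumes the names are unique; a plain list would be safer if they are not.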