2015-05-06 43 views

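Fetching data using PHP cURL and a regular expression

I want to fetch information from a web page using PHP cURL. I filter the response with a PHP regular expression to match the tags, but it does not work.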

Here is the web page: click here

Here is my PHP code:

if(preg_match('/<div class="price-gruop"><span class="text-price">Price:<\/span>(.*?)<\/div>/', get_page($url),$matches2)) 
     { 
     $matches2[1] = strtolower($matches2[1]); 
     $data['price']=$matches2[1]; 

     } 

function get_page($url){ 
$ch = curl_init(); 
curl_setopt($ch, CURLOPT_URL, $url); 
//curl_setopt($ch, CURLOPT_PROXY, $proxy); 
curl_setopt($ch, CURLOPT_HEADER, 0); // return headers 0 no 1 yes 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return page 1:yes 
curl_setopt($ch, CURLOPT_TIMEOUT, 200); // HTTP request timeout: 200 seconds
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // Follow redirects, need this if the url changes 
curl_setopt($ch, CURLOPT_MAXREDIRS, 2); // follow at most 2 redirects if the server responds with a redirection
curl_setopt($ch, CURLOPT_USERAGENT, 
    "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7"); 
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookies.txt"); // cookies storage/here the changes have been made 
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookies.txt"); 
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // false for https 
curl_setopt($ch, CURLOPT_ENCODING, "gzip"); // the page encoding 

$data = curl_exec($ch); // execute the http request 
curl_close($ch); // close the connection 
return $data; 
} 

I get an empty string. Please tell me how to get the value between the tags.


Check out [xpath](http://php.net/manual/en/class.domxpath.php). It is better suited to parsing HTML if there are many elements to piece together. – gwillie

Answer

1

Use PHP Simple HTML DOM. Download simple_html_dom.php from here: link

$url = "http://vikramshopping.com/reallife-3in1-printscriptcopy-5"; 
// Include the library 
include('simple_html_dom.php'); 

// Retrieve the DOM from a given URL 
$html = file_get_html($url); 

This is what you need, as far as I understand:

// Find all DIV tags that have a class of "price-gruop" 
foreach($html->find('div.price-gruop') as $e) { 
    echo $e->outertext . '<br>'; 
} 
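The xpath approach suggested in the comment above can be sketched with PHP's built-in DOMDocument and DOMXPath classes, with no extra library to download. This is only a minimal sketch: the markup is the sample fragment from this answer rather than the live page, and the variable names are illustrative.

```php
<?php
// Sample markup copied from this answer; the live page may differ.
$html = '<div class="price-gruop">
          <span class="text-price">Price:</span>
                 INR135.00             </div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // real pages are rarely valid HTML; collect parser warnings silently
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// text() selects only the text nodes that are direct children of the div,
// so the "Price:" label inside the span is skipped.
$nodes = $xpath->query('//div[@class="price-gruop"]/text()');

$price = '';
foreach ($nodes as $node) {
    $price .= $node->nodeValue;
}
$price = trim($price);

echo $price, "\n";
```

Because the label lives in its own span, selecting the div's direct text nodes leaves just the amount once surrounding whitespace is trimmed.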

Or with preg_match:

$html = '<div class="price-gruop">
          <span class="text-price">Price:</span>
                 INR135.00             </div>';
// \s* tolerates the whitespace and newlines between the tags;
// the lazy (.*?) keeps the trailing spaces out of the captured value
if (preg_match('/<div class="price-gruop">\s*<span class="text-price">\s*Price:\s*<\/span>\s*(.*?)\s*<\/div>/', $html, $matches)) {
    echo '<pre>' . 'Price: ' . $matches[1] . '</pre>';
}

Demo with preg_match
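To see why the pattern from the question returns nothing, the two patterns can be run against the same fragment. The sketch below assumes the page markup looks like the sample used in this answer; `extract_price()` is a hypothetical helper introduced only for the comparison, and it lowercases the result the way the question's code does.

```php
<?php
$html = '<div class="price-gruop">
          <span class="text-price">Price:</span>
                 INR135.00             </div>';

// Pattern from the question: expects <span> to follow <div> with nothing in between.
$strict = '/<div class="price-gruop"><span class="text-price">Price:<\/span>(.*?)<\/div>/';

// Pattern with \s* between the tokens and a lazy capture, so surrounding whitespace is excluded.
$tolerant = '/<div class="price-gruop">\s*<span class="text-price">\s*Price:\s*<\/span>\s*(.*?)\s*<\/div>/';

// Hypothetical helper: returns the lowercased capture, or null when the pattern does not match.
function extract_price(string $pattern, string $html): ?string
{
    if (preg_match($pattern, $html, $m) === 1) {
        return strtolower($m[1]);
    }
    return null;
}

var_dump(extract_price($strict, $html));   // NULL
var_dump(extract_price($tolerant, $html)); // string(9) "inr135.00"
```

The strict pattern fails at the newline right after the opening div tag, which would explain the empty result when the real page is indented the same way.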

You can also use the other examples below:

// Find all "A" tags and print their HREFs 
foreach($html->find('a') as $e) 
    echo $e->href . '<br>'; 

// Retrieve all images and print their SRCs 
foreach($html->find('img') as $e) 
    echo $e->src . '<br>'; 

// Find all images, print their text with the "<>" included 
foreach($html->find('img') as $e) 
    echo $e->outertext . '<br>'; 

// Find the DIV tag with an id of "myId" 
foreach($html->find('div#myId') as $e) 
    echo $e->innertext . '<br>'; 

// Find all SPAN tags that have a class of "myClass" 
foreach($html->find('span.myClass') as $e)
    echo $e->outertext . '<br>'; 

// Find all TD tags with "align=center" 
foreach($html->find('td[align=center]') as $e) 
    echo $e->innertext . '<br>'; 

// Extract all text from a given cell 
echo $html->find('td[align="center"]', 1)->plaintext.'<br><hr>'; 

Thanks for your help, but I need to use a regular expression. –


Updated the answer. – Noman


Thanks, Noman. It works. Superb. –