Regular expression to extract the title from a web page's source

The following regular expression works for most of my URLs. For a few URLs, however, it returns no title even though the page source contains one.

$data = file_get_contents($url);
$title = get_title($data);
echo $title;

// Returns the text between <title> and </title>, or '' if there is no match.
function get_title($html)
{
    return preg_match('!<title>(.*?)</title>!i', $html, $matches) ? $matches[1] : '';
}

Here is a demo: DEMO


2013-11-26 user123

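I think this question has already been answered here on SO; check http://stackoverflow.com/questions/13510124/ –

Just change your regex to the one mentioned in my answer – Nishant

Judging from your demo, the problem is not your regex but fetching the page content with `file_get_contents()` in the first place. – ajp15243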

To solve your problem, try the code snippet below:

<?php 
$url= 'http://www.indianic.com'; 
$ch = curl_init(); 
curl_setopt($ch, CURLOPT_URL, $url); 
curl_setopt($ch, CURLOPT_HEADER, 0); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
$result=curl_exec($ch); 
curl_close($ch); 
//echo $result; 
$title = get_title($result); 
echo $title; 
function get_title($html) 
{ 
      return preg_match('!<title>(.*?)</title>!i', $html, $matches) ? $matches[1] : ''; 
} 
?>


2013-11-26 06:06:00

There is no need to replace file_get_contents with a curl call; the regex is what needs fixing – Nishant

Try this:

return preg_match('/<title[^>]*>(.*?)<\/title>/ims', $html, $matches) ? $matches[1] : '';

Tested and working:

$url = 'http://www.ndtv.com/';
$data = file_get_contents($url);
$title = preg_match('/<title[^>]*>(.*?)<\/title>/ims', $data, $matches) ? $matches[1] : '';
echo $title;

Output: NDTV.com: India, Business, Bollywood, Cricket, Video and Latest News
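
To illustrate why the pattern change matters, here is a minimal sketch comparing the question's pattern with the one from this answer. The two sample HTML strings and the function names `get_title_strict` and `get_title_relaxed` are made up for illustration, and the `trim()` call is an addition that is not part of the answer's code.

```php
<?php
// Pattern from the question: requires a bare <title> tag and, without
// the s flag, "." cannot match the newlines inside a multi-line title.
function get_title_strict(string $html): string
{
    return preg_match('!<title>(.*?)</title>!i', $html, $m) ? $m[1] : '';
}

// Pattern from this answer: [^>]* tolerates attributes on <title>,
// and the s flag lets "." match newlines. trim() is an extra step
// (an assumption, not in the answer) that drops surrounding whitespace.
function get_title_relaxed(string $html): string
{
    return preg_match('/<title[^>]*>(.*?)<\/title>/ims', $html, $m) ? trim($m[1]) : '';
}

// Hypothetical inputs, not taken from any real site.
$multiline = "<head><title>\n  Example Page\n</title></head>";
$withAttr  = '<head><title id="pageTitle">Example Page</title></head>';

var_dump(get_title_strict($multiline));  // string(0) ""
var_dump(get_title_relaxed($multiline)); // string(12) "Example Page"
var_dump(get_title_strict($withAttr));   // string(0) ""
var_dump(get_title_relaxed($withAttr));  // string(12) "Example Page"
```

Pages whose title spans several lines, or whose `<title>` tag carries attributes, are likely the "few URLs" where the original pattern returns nothing.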


2013-11-26 06:06:41 Nishant

Thanks, but it fails again for 'https://www.facebook.com/' – user123

The reason is that file_get_contents() does not work for Facebook; try this one: http://stackoverflow.com/questions/7437688/loading-fb-thru-php-file-get-contents-throws-you-are-using-an-incompatible-we – Nishant
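
The linked question suggests the failure is related to how the request identifies itself to the server. One possible workaround, sketched here as an assumption and not as a confirmed fix, is to pass `file_get_contents()` a stream context that sends a browser-like User-Agent. The function names `make_browser_context` and `fetch_title` and the User-Agent string are hypothetical, and a given site may still refuse the request.

```php
<?php
// Build a stream context that sends a browser-like User-Agent header.
// The default User-Agent string is an illustrative assumption.
function make_browser_context(string $userAgent = 'Mozilla/5.0 (compatible; TitleFetcher/1.0)')
{
    return stream_context_create([
        'http' => [                   // the "http" options also apply to https:// URLs
            'user_agent'      => $userAgent,
            'follow_location' => 1,   // follow HTTP redirects
            'timeout'         => 10,  // give up after 10 seconds
        ],
    ]);
}

// Fetch a page and return its title, or '' if the fetch or the match fails.
function fetch_title(string $url): string
{
    // @ suppresses the warning; failure is detected via the false return value.
    $html = @file_get_contents($url, false, make_browser_context());
    if ($html === false) {
        return '';
    }
    return preg_match('/<title[^>]*>(.*?)<\/title>/ims', $html, $m) ? trim($m[1]) : '';
}
```

Calling `echo fetch_title('https://www.facebook.com/');` would then print the title if the server accepts the request, and an empty string otherwise. Checking for `false` separately from the regex result also makes it easier to tell a failed fetch apart from a failed match.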
