2009-07-17 322 views
5

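Get URL from string

I've been looking for code that uses PHP to get a URL out of a string. I'm basically trying to pull a shortened URL out of a message and then do a HEAD request to find the actual link.

Does anyone have any code that returns the URLs in a string?

Thanks in advance.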

Edit for Ghost Dog:

Here is a sample of what I'm parsing:

$test = "I am testing this application for http://test.com YAY!"; 

And here is the response I got that solves it:

$regex = '$\b(https?|ftp|file)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[-A-Z0-9+&@#/%=~_|]$i'; 

preg_match_all($regex, $test, $result, PREG_PATTERN_ORDER); 
$A = $result[0]; 

foreach($A as $B) 
{ 
    $URL = GetRealURL($B); 
    echo "$URL<BR>";  
} 


function GetRealURL($url) 
{ 
    $options = array(
     CURLOPT_RETURNTRANSFER => true, 
     CURLOPT_HEADER   => true, 
     CURLOPT_FOLLOWLOCATION => true, 
     CURLOPT_ENCODING  => "", 
     CURLOPT_USERAGENT  => "spider", 
     CURLOPT_AUTOREFERER => true, 
     CURLOPT_CONNECTTIMEOUT => 120, 
     CURLOPT_TIMEOUT  => 120, 
     CURLOPT_MAXREDIRS  => 10, 
    ); 

    $ch  = curl_init($url); 
    curl_setopt_array($ch, $options); 
    $content = curl_exec($ch); 
    $err  = curl_errno($ch); 
    $errmsg = curl_error($ch); 
    $header = curl_getinfo($ch); 
    curl_close($ch); 
    return $header['url']; 
} 

See the answer for details.

+0

How about showing an example of what you're parsing? – ghostdog74 2009-07-17 23:58:46

Answers

10

This code may be helpful (see MadTechie's latest post):

http://www.phpfreaks.com/forums/index.php/topic,245248.msg1146218.html#msg1146218

<?php 
$string = "some random text http://tinyurl.com/9uxdwc some http://google.com random text http://tinyurl.com/787988"; 

$regex = '$\b(https?|ftp|file)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[-A-Z0-9+&@#/%=~_|]$i'; 

preg_match_all($regex, $string, $result, PREG_PATTERN_ORDER); 
$A = $result[0]; 

foreach($A as $B) 
{ 
    $URL = GetRealURL($B); 
    echo "$URL<BR>"; 
} 


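// Follow any redirects from $url and return the final (effective) URL.
function GetRealURL($url) 
{ 
    $options = array(
     CURLOPT_RETURNTRANSFER => true,  // return the response instead of printing it
     CURLOPT_HEADER   => true,  // include response headers in the output
     CURLOPT_FOLLOWLOCATION => true,  // follow Location: redirects
     CURLOPT_ENCODING  => "",  // accept any encoding cURL supports
     CURLOPT_USERAGENT  => "spider", 
     CURLOPT_AUTOREFERER => true,  // set Referer automatically on redirects
     CURLOPT_CONNECTTIMEOUT => 120,  // seconds
     CURLOPT_TIMEOUT  => 120,  // seconds
     CURLOPT_MAXREDIRS  => 10,  // give up after 10 redirects
    ); 

    $ch  = curl_init($url); 
    curl_setopt_array($ch, $options); 
    curl_exec($ch);     // response body is not needed, only the transfer info
    $header = curl_getinfo($ch); 
    curl_close($ch); 
    return $header['url'];   // last URL reached after redirects
} 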

?> 
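
The question asks for a HEAD request, while the code above does a full GET and downloads the body. A minimal sketch of a HEAD variant follows, assuming the cURL extension is loaded and PHP 7.1 or later. The names `headRequestOptions` and `resolveWithHead` and the 10-second default timeout are illustrative assumptions, not part of the original post.

```php
<?php
// Hypothetical helpers for illustration; not from the original post.

/** Build cURL options for a HEAD request that follows redirects. */
function headRequestOptions(int $timeoutSeconds = 10): array
{
    return [
        CURLOPT_NOBODY         => true,  // send HEAD instead of GET
        CURLOPT_FOLLOWLOCATION => true,  // follow Location: redirects
        CURLOPT_MAXREDIRS      => 10,    // give up after 10 redirects
        CURLOPT_RETURNTRANSFER => true,  // do not print the response
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds,
        CURLOPT_TIMEOUT        => $timeoutSeconds,
    ];
}

/** Return the final URL after redirects, or null if the request failed. */
function resolveWithHead(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, headRequestOptions());

    // With CURLOPT_RETURNTRANSFER, curl_exec() returns false on failure.
    if (curl_exec($ch) === false) {
        return null;
    }
    return curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
}
```

Some servers answer HEAD differently from GET, or reject it outright, so falling back to `GetRealURL` when this returns null may be worth considering.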
+0

Yes, that's exactly what I needed – 2009-07-18 00:12:56

2

Something like:

$matches = array(); 
preg_match_all('/http:\/\/[a-zA-Z0-9.-]+\/[a-zA-Z0-9.-]+/', $text, $matches); 
print_r($matches); 

You'll need to tweak the regular expression to get exactly what you want.
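
As one possible tweak, the pattern above only matches `http://` URLs that have a path segment, so `http://test.com` from the question's sample would be missed. Here is a rough sketch that also accepts `https` and bare hosts and trims trailing sentence punctuation. The function name `extractUrls` is made up for illustration, and the arrow function assumes PHP 7.4 or later.

```php
<?php
// Hypothetical helper for illustration; not from the original answers.
function extractUrls(string $text): array
{
    // Scheme, then a run of characters that are not whitespace,
    // angle brackets, or quotes.
    $pattern = '~\bhttps?://[^\s<>"\']+~i';
    preg_match_all($pattern, $text, $matches);

    // Strip punctuation that usually belongs to the sentence, not the URL.
    return array_map(
        fn (string $url): string => rtrim($url, '.,;:!?)'),
        $matches[0]
    );
}

print_r(extractUrls('I am testing this application for http://test.com YAY!'));
```

Trimming `)` will break URLs that legitimately end in a closing parenthesis, so adjust the `rtrim` character list to fit the messages you are parsing.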

To get the real URL, consider something simple like:

curl -I http://url.com/path | grep Location: | awk '{print $2}'
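
If you'd rather stay in PHP, the `grep`/`awk` step can be approximated with a small regex over the raw header text. This is only a sketch: `locationFromHeaders` is a hypothetical name, it assumes PHP 7.1 or later, and it expects you to already have the response headers as one string (for example from `curl_exec` with `CURLOPT_HEADER` enabled).

```php
<?php
// Hypothetical helper for illustration: a PHP counterpart of
// `grep Location: | awk '{print $2}'`.
function locationFromHeaders(string $rawHeaders): ?string
{
    // /m lets ^ match at the start of every line;
    // /i is needed because header names are case-insensitive.
    if (preg_match('/^Location:[ \t]*(\S+)/mi', $rawHeaders, $m)) {
        return $m[1];
    }
    return null; // no redirect header present
}
```

Like the shell pipeline, this only reports a single hop (the first `Location` header it sees), so a chain of redirects needs either repeated requests or `CURLOPT_FOLLOWLOCATION`.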

+0

No need for grep: curl -I http://url.com/path | awk '/Location/{print $2}' – ghostdog74 2009-07-18 00:19:08