Filtering and processing URLs from tweets

I am processing tweets and collecting the URLs they contain.

If the URL points to Twitter (that is, it starts with t.co or twitter.com), skip it.
If the URL in the tweet is a short URL, convert it to the long URL.

Code:

if (preg_match($reg_exUrl, $tweet, $url)) {
    preg_match_all($reg_exUrl, $tweet, $urls);
    foreach ($urls[0] as $url) {
        echo "Tiny url : {$url}<br>";
        $full = MyURLDecode($url);
        echo "Full url : $full<br>";
        if (strpos($full, '//t.co') === true)
            continue;
        if (strpos($full, '//twitter.com') === true)
            continue;
        else if (strpos($full, '//bit.ly') !== true)
            $full = MyURLDecode($full);
        $url_count = get_twitter_url_count($full);
        echo "Url count: $url_count";
        //echo "Numbers of tweets containing this link : ", $code['count'];
        echo "<br>";
    }
} else {
    echo "Mismatch<br>";
}

function MyURLDecode($url)
{
    $ch = @curl_init($url);
    @curl_setopt($ch, CURLOPT_HEADER, TRUE);
    @curl_setopt($ch, CURLOPT_NOBODY, TRUE);
    @curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
    @curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
    $url_resp = @curl_exec($ch);
    preg_match('/Location:\s+(.*)\n/i', $url_resp, $i);
    if (!isset($i[1]))
    {
        return $url;
    }
    return $i[1];
}

function get_twitter_url_count($url) {
    $encoded_url = urlencode($url);
    $content = @file_get_contents('http://urls.api.twitter.com/1/urls/count.json?url=' . $encoded_url);
    return $content ? json_decode($content)->count : 0;
}

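The problems with this are:

It does not skip the Twitter URLs.
In some cases the long URL is itself another short URL that needs to be expanded again, but that does not happen here.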

2014-01-06 user3121782

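#1: strpos() returns the starting position of the text it finds (or false when the text is not found), so the result is never === true. You need to test it like this, for example: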

strpos($full, '//t.co') !== false
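
To make the difference concrete, here is a minimal sketch of the skip check built on that comparison. The helper name is_twitter_url() is hypothetical (it does not appear in the question), and the two needles are simply the ones the question already uses.

```php
<?php
// Hypothetical helper, not part of the original post:
// returns true when $url looks like a Twitter link.
function is_twitter_url(string $url): bool
{
    // strpos() yields an integer offset or false, never true,
    // so the result has to be compared against false with a strict operator.
    return strpos($url, '//t.co') !== false
        || strpos($url, '//twitter.com') !== false;
}

var_dump(strpos('http://t.co/abc', '//t.co'));          // int(5)
var_dump(strpos('http://t.co/abc', '//t.co') === true); // bool(false)
var_dump(is_twitter_url('http://t.co/abc'));            // bool(true)
var_dump(is_twitter_url('http://example.com/page'));    // bool(false)
```

Inside the foreach loop, the two `=== true` checks could then be replaced by a single `if (is_twitter_url($full)) continue;`. Note that a plain substring test also matches hosts that merely begin with the same characters, so comparing the host returned by parse_url() may be a stricter alternative.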

#2: try calling MyURLDecode() in a while loop, for example:

$previous = $full; 
while (($full = MyURLDecode($full)) != $previous) { 
    $previous = $full; 
}
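
The same idea can be sketched with a cap on the number of hops, so that two short links pointing at each other cannot keep the loop running forever. The names resolve_fully(), $resolve and $maxHops are assumptions made for illustration; the redirect table below is made-up data that stands in for MyURLDecode() so the example runs without network access.

```php
<?php
// Hypothetical helper: keep expanding a URL until it stops changing.
// $resolve is any callable that maps a URL to its next hop
// (for example 'MyURLDecode'); $maxHops is an assumed safety limit.
function resolve_fully(string $url, callable $resolve, int $maxHops = 10): string
{
    for ($i = 0; $i < $maxHops; $i++) {
        $next = $resolve($url);
        if ($next === $url) {
            break; // nothing changed, so this is the final URL
        }
        $url = $next;
    }
    return $url;
}

// Offline stand-in for MyURLDecode(), backed by a made-up redirect table.
$hops = [
    'http://a.example/1' => 'http://b.example/2',
    'http://b.example/2' => 'http://c.example/3',
];
$fake = function (string $u) use ($hops): string {
    return $hops[$u] ?? $u;
};

echo resolve_fully('http://a.example/1', $fake), "\n"; // http://c.example/3
```

With the question's code this would be called as `resolve_fully($url, 'MyURLDecode')`. Because HTTP header lines end in CRLF and the pattern in MyURLDecode() only stops at `\n`, the captured Location value can carry a trailing carriage return, so trimming it before comparing is probably worthwhile.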

2014-01-06 12:56:40 jedifans
