2015-09-13
0

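Strange characters when pulling URL elements with PHP?

I have a problem: I'm trying to pull basic metadata from an external URL. I've managed to get most of it working, but a few characters come out wrong. Letters such as Ä, ä and ö are garbled, so the image URL I read back is mangled when it should be mäenjaksa7-300x200.jpg. My code is below; thanks for any help.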

function file_get_contents_curl($url)
{
    $ch = curl_init();

    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);

    $data = curl_exec($ch);
    curl_close($ch);

    return $data;
}

$html = file_get_contents_curl($params['url']); 

//parsing begins here: 
$doc = new DOMDocument(); 
@$doc->loadHTML($html); 
$nodes = $doc->getElementsByTagName('title'); 

//get and display what you need: 
$urltitle = $nodes->item(0)->nodeValue; 

$metas = $doc->getElementsByTagName('meta'); 

for ($i = 0; $i < $metas->length; $i++)
{
    $meta = $metas->item($i);
    if ($meta->getAttribute('name') == 'description')
        $description = $meta->getAttribute('content');
    if ($meta->getAttribute('name') == 'keywords')
        $keywords = $meta->getAttribute('content');
    if ($meta->getAttribute('property') == 'og:image')
        $ogimage = $meta->getAttribute('content');
    if ($meta->getAttribute('rel') == 'image_src')
        $relimage = $meta->getAttribute('content');
}

if (empty($ogimage)) {
    $metaimage = $relimage;
} else {
    $metaimage = $ogimage;
}

Answers

0

Solution: find this line:

$html = file_get_contents_curl($url);

and add this below it:

// Convert the encoding from UTF-8 to ISO-8859-1
$html = iconv('UTF-8', 'ISO-8859-1//TRANSLIT', $html);
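
To illustrate why this conversion can help: when the markup declares no charset, DOMDocument::loadHTML() typically decodes the bytes as a single-byte Latin-1 style encoding, so the two UTF-8 bytes of "ä" turn into two separate characters. The following is a minimal sketch, assuming the fetched page is UTF-8 and carries no charset declaration. The inline $html string and the title_of() helper are made up for illustration and are not part of the original code.

```php
<?php
// Hypothetical helper for illustration: parse markup and return the <title> text.
function title_of(string $html): string
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $nodes = $doc->getElementsByTagName('title');
    return $nodes->length > 0 ? $nodes->item(0)->nodeValue : '';
}

// Stand-in for the fetched page: UTF-8 bytes, no charset declared in the markup.
$html = "<html><head><title>m\u{00E4}enjaksa7-300x200.jpg</title></head><body></body></html>";

// Parsed as-is, the UTF-8 bytes of the umlaut are usually mis-decoded.
$raw = title_of($html);

// Converted to ISO-8859-1 first, the bytes match what the parser assumes.
$fixed = title_of(iconv('UTF-8', 'ISO-8859-1//TRANSLIT', $html));

echo $raw, PHP_EOL;
echo $fixed, PHP_EOL;
```

Note that //TRANSLIT only approximates characters that have no ISO-8859-1 equivalent, so pages with text outside Latin-1 may need a different approach.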
1

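Maybe you need to make sure the response headers for your URL include a Content-Type with charset UTF-8, or whichever charset is appropriate. You must either make sure the content at your URL contains no non-ASCII characters, or make sure you have set the proper character encoding. Maybe I haven't fully understood your problem, but take a look at this example. It is not related to your code, but it may be useful: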

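// $xml_data (the request body) and $credentials ("user:password") are assumed to be defined earlier.
$url = "http://www.example.com/services/calculation";
$headers = array(
    "Content-Type: text/xml; charset=\"utf-8\"",
    "Accept: text/xml",
    "Cache-Control: no-cache",
    "Pragma: no-cache",
    "SOAPAction: \"run\"",
    "Content-Length: " . strlen($xml_data),
    "Authorization: Basic " . base64_encode($credentials)
);

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 60);
// Send the body as a POST request; the request line is not a header, so it is not listed in $headers.
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $xml_data);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
// $defined_vars is assumed to hold the server variables, as in the original snippet.
curl_setopt($ch, CURLOPT_USERAGENT, $defined_vars['HTTP_USER_AGENT']);

$data = curl_exec($ch);
curl_close($ch);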
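
For the question's case, where a page is fetched and not posted to, one way to check which charset the server actually declares is to read the response's Content-Type through curl_getinfo() with CURLINFO_CONTENT_TYPE and pull out the charset parameter. This is only a rough sketch: the charset_from_content_type() helper is a made-up name, and the sample header values are illustrative.

```php
<?php
// Hypothetical helper: extract the charset parameter from a Content-Type value,
// e.g. the string returned by curl_getinfo($ch, CURLINFO_CONTENT_TYPE).
function charset_from_content_type(?string $contentType): ?string
{
    if ($contentType === null) {
        return null;
    }
    // Match charset=value, with or without surrounding double quotes.
    if (preg_match('/charset\s*=\s*"?([^";\s]+)"?/i', $contentType, $m)) {
        return strtoupper($m[1]);
    }
    return null;
}

// In file_get_contents_curl() this would be read before curl_close($ch):
//     $contentType = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
// Sample values are used here so the sketch runs without a network request.
var_dump(charset_from_content_type('text/html; charset=utf-8'));
var_dump(charset_from_content_type('text/xml;charset="utf-8"'));
var_dump(charset_from_content_type('text/html'));
```

If the helper returns null, the server sent no charset, and the page's own meta tags (or a guess) are all that is left to go on.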
+0

Hi, thanks for your reply. I found a solution similar to yours: I added $html = iconv('UTF-8', 'ISO-8859-1//TRANSLIT', $html); below $html = file_get_contents_curl($url); to convert the encoding. Now I just need to figure out why I can't read the rel="image_src" tag – David
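
On the open point in the comment above: image_src is conventionally published as a <link rel="image_src" href="..."> element, not as a <meta> tag. If the target page follows that convention, a loop over meta elements never sees it, and the URL sits in href, not content. Here is a sketch under that assumption; the sample markup and the find_image_src() name are invented for illustration.

```php
<?php
// Hypothetical helper: return the href of the first <link rel="image_src">, or null.
function find_image_src(DOMDocument $doc): ?string
{
    foreach ($doc->getElementsByTagName('link') as $link) {
        if ($link->getAttribute('rel') === 'image_src') {
            return $link->getAttribute('href');
        }
    }
    return null;
}

// Made-up sample markup standing in for the fetched page.
$html = '<html><head><title>Example</title>'
      . '<link rel="image_src" href="http://www.example.com/images/photo.jpg">'
      . '</head><body></body></html>';

$doc = new DOMDocument();
$previous = libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$relimage = find_image_src($doc);
echo $relimage ?? '(no image_src link found)', PHP_EOL;
```

Returning null when nothing matches also avoids reading an undefined $relimage in the later fallback to $metaimage.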