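Title and meta tags of an external website

I need to get the title, meta tags, and image tags of shopping-site pages from their URLs. Here is my code; it works with Amazon product links, but it does not work for URLs such as: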
- http://www.alternate.de/Synology/Synology+DS413,_NAS/html/product/1028780/?
- http://www.bonprix.de/produkt/baby-fleecejacke-hellgrau-meliert-958416/
My code for getting the tags:
$url = "http://rads.stackoverflow.com/amzn/click/B009T9QCWI";

// Fetch the page body with cURL.
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$data = curl_exec($ch);
if ($data === false || $data === "") {
    // Stop early: loadHTML() cannot parse an empty or failed response.
    die("Request failed: " . curl_error($ch));
}
curl_close($ch);
$returned_content = $data;

// Parse the HTML; @ suppresses warnings about malformed markup.
$doc = new \DOMDocument();
@$doc->loadHTML($returned_content);

// Page title (empty string if the page has no <title>).
$nodes = $doc->getElementsByTagName("title");
$title = $nodes->length > 0 ? $nodes->item(0)->nodeValue : "";
$product_title = str_replace("'", " ", $title);

// Collect absolute image URLs that end in .jpg or .jpeg.
$xml = simplexml_import_dom($doc);
$images = $xml->xpath("//img");
$imagesrc = array();
$image = array();
$j = 0;
foreach ($images as $img) {
    $src = (string) $img["src"];
    $host = explode(":", $src);
    $ht = $host[0];
    if ($ht == "http" || $ht == "https") {
        $info = pathinfo($src);
        // Reset per image so a previous image's extension is not reused.
        $extension = array_key_exists('extension', $info) ? $info["extension"] : "";
        if ($extension == "jpg" || $extension == "jpeg") {
            $imagesrc[] = $src;
            $j++;
            $image[] = $src;
        }
    }
}

// Read the description and keywords meta tags.
$description = "";
$keywords = "";
$metas = $doc->getElementsByTagName('meta');
for ($i = 0; $i < $metas->length; $i++) {
    $meta = $metas->item($i);
    if ($meta->getAttribute('name') == 'description' || $meta->getAttribute('name') == 'Description') {
        $description = $meta->getAttribute('content');
    }
    if ($meta->getAttribute('name') == 'keywords') {
        $keywords = $meta->getAttribute('content');
    }
}

// Build the result; the 'img' key is only present when images were found.
$domarray = array();
if (empty($image)) {
    $domarray[] = array('desc' => $description, 'title' => $product_title);
} else {
    $domarray[] = array('img' => $image, 'desc' => $description, 'title' => $product_title);
}
print_r($domarray);
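
The post does not say why the other shops fail, so the following is only a guess at one common cause: the request above uses cURL's defaults, which do not follow redirects, send no User-Agent, and do not ask for compressed responses. A minimal sketch of a fetch that sets these options is shown here. The function name `fetch_html` and the User-Agent string are hypothetical, not part of the original code, and whether this helps for the two URLs listed is untested.

```php
<?php
// Hypothetical helper (not from the original post): fetch a page body
// with options that some sites expect from a browser-like client.
function fetch_html(string $url, int $timeout = 5): ?string
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => $timeout,
        CURLOPT_TIMEOUT        => $timeout * 2,  // overall limit, assumed value
        CURLOPT_FOLLOWLOCATION => true,          // follow 301/302 redirects
        CURLOPT_MAXREDIRS      => 5,
        CURLOPT_ENCODING       => '',            // accept any encoding cURL supports
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; ExampleBot/1.0)', // placeholder
    ]);
    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    // Treat transport errors and HTTP error statuses as "no content".
    if ($body === false || $status >= 400) {
        return null;
    }
    return $body;
}
```

Returning `null` on failure lets the caller skip parsing instead of passing an empty string to `loadHTML()`.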
You are trying to parse HTML as XML. Please don't (http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags). I think Amazon has an API that gives you this information in a more parseable form (https://affiliate-program.amazon.com/gp/advertising/api/detail/main.html). – ToBe
@ToBe Note that `loadHTML()` is used here. That is perfectly workable. – hek2mgl
Thank you. But my target is not only Amazon. Amazon is just the site this code works for. –
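
As the comments note, `loadHTML()` copes with real-world HTML, so the same extraction can be written for any site without going through SimpleXML. The sketch below is one way to do that with `DOMXPath`. The function name `extract_page_info` and the result keys are assumptions chosen for illustration. Unlike the code in the question, it keeps relative image URLs and matches meta names case-insensitively.

```php
<?php
// Hypothetical helper: pull title, meta description/keywords, and
// JPEG image URLs out of an HTML string.
function extract_page_info(string $html): array
{
    $info = ['title' => '', 'desc' => '', 'keywords' => '', 'img' => []];
    if (trim($html) === '') {
        return $info; // nothing to parse
    }

    // Collect libxml warnings internally instead of silencing with @.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // string() yields '' when there is no <title> element.
    $info['title'] = trim($xpath->evaluate('string(//title)'));

    foreach ($xpath->query('//meta[@name and @content]') as $meta) {
        $name = strtolower($meta->getAttribute('name'));
        if ($name === 'description') {
            $info['desc'] = $meta->getAttribute('content');
        } elseif ($name === 'keywords') {
            $info['keywords'] = $meta->getAttribute('content');
        }
    }

    foreach ($xpath->query('//img[@src]') as $img) {
        $src  = $img->getAttribute('src');
        // Check the extension on the path only, ignoring any query string.
        $path = (string) parse_url($src, PHP_URL_PATH);
        $ext  = strtolower(pathinfo($path, PATHINFO_EXTENSION));
        if ($ext === 'jpg' || $ext === 'jpeg') {
            $info['img'][] = $src;
        }
    }
    return $info;
}
```

Because the function takes a string rather than a URL, it can be tried on saved copies of the failing pages to see whether the problem lies in the download or in the parsing.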