2016-04-16 100 views
0

PHP: get img src from XML

I have a page whose XML looks like this:

<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"> 
    <channel> 
    <title>FB-RSS feed for Salman Khan Fc</title> 
    <link>http://facebook.com/profile.php?id=1636293749919827/</link> 
    <description>FB-RSS feed for Salman Khan Fc</description> 
    <managingEditor>http://fbrss.com (FB-RSS)</managingEditor> 
    <pubDate>31 Mar 16 20:00 +0000</pubDate> 
    <item> 
     <title>Photo - Who is the Best Khan ?</title> 
     <link>https://www.facebook.com/SalmanKhanFns/photos/a.1639997232882812.1073741827.1636293749919827/1713146978901170/?type=3</link> 
     <description>&lt;a href=&#34;https://www.facebook.com/SalmanKhanFns/photos/a.1639997232882812.1073741827.1636293749919827/1713146978901170/?type=3&#34;&gt;&lt;img src=&#34;https://scontent.xx.fbcdn.net/hphotos-xap1/v/t1.0-0/s130x130/11059765_1713146978901170_8711054263905505442_n.jpg?oh=fa2978c5ecfb3ae424e9082aaa057b8f&amp;oe=57BB41D5&#34;&gt;&lt;/a&gt;&lt;br&gt;&lt;br&gt;Who is the Best Khan ?</description> 
     <author>FB-RSS</author> 
     <guid>1636293749919827_1713146978901170</guid> 
     <pubDate>31 Mar 16 20:00 +0000</pubDate> 
    </item> 
    <item> 
     <title>Photo</title> 
     <link>https://www.facebook.com/SalmanKhanFns/photos/a.1636293813253154.1073741825.1636293749919827/1713146755567859/?type=3</link> 
     <description>&lt;a href=&#34;https://www.facebook.com/SalmanKhanFns/photos/a.1636293813253154.1073741825.1636293749919827/1713146755567859/?type=3&#34;&gt;&lt;img src=&#34;https://scontent.xx.fbcdn.net/hphotos-xap1/v/t1.0-0/s130x130/12294686_1713146755567859_6728330714340999478_n.jpg?oh=6d90a688fdf4342f9e12e9ff9a66b127&amp;oe=57778068&#34;&gt;&lt;/a&gt;&lt;br&gt;&lt;br&gt;</description> 
     <author>FB-RSS</author> 
     <guid>1636293749919827_1713146755567859</guid> 
     <pubDate>31 Mar 16 19:58 +0000</pubDate> 
    </item> 
    </channel> 
</rss> 

I want to get the src of every img in the XML above.

The images are stored in <description>, but they are not in the

<img...

format; instead they look like:

&lt;img src=&#34;https://scontent.xx.fbc...

The < has been replaced by &lt; ... I guess that is why $imgs = $dom->getElementsByTagName('img'); returns nothing.

Is there a workaround?

This is how I call it:

libxml_use_internal_errors(true); 
$dom = new DOMDocument(); 
$dom->loadXML($xml_file); 
$imgs = ...(get the imgs to extract the src...('img') ??; 

//Then run a possible foreach 
//something like: 

foreach($imgs as $img){ 

    $src= ///the src of the $img 

    //try it out 
    echo '<img src="'.$src.'" /> <br />';
} 

Any ideas?

Answers

1

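You have HTML embedded as escaped text inside XML elements, so you have to retrieve the XML nodes, load the HTML of each one, and then read the tag attribute you need.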
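To make the point concrete, here is a minimal sketch of why the direct lookup comes back empty. The one-item feed and the example.com URL are assumptions for illustration, trimmed from the shape of the XML in the question.

```php
<?php
// Hypothetical one-item feed with the same kind of escaped markup as the question.
$xml = '<rss version="2.0"><channel><item>'
     . '<description>&lt;img src=&#34;https://example.com/a.jpg?x=1&amp;y=2&#34;&gt;&lt;br&gt;Hi</description>'
     . '</item></channel></rss>';

$dom = new DOMDocument();
$dom->loadXML($xml);

// The escaped markup is character data, so the XML tree contains no img element.
$imgCount = $dom->getElementsByTagName('img')->length;

// The XML parser has already decoded the entities in the text content.
$decoded = $dom->getElementsByTagName('description')->item(0)->nodeValue;

echo $imgCount, "\n"; // 0
echo $decoded, "\n";  // <img src="https://example.com/a.jpg?x=1&y=2"><br>Hi
```

The decoded string is ordinary HTML, which is why it can be handed to a second parser without calling html_entity_decode first.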

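Your XML has <description> nodes at different levels (one for the channel and one per item), so ->getElementsByTagName would return more nodes than you need. Use DOMXPath to retrieve only the <description> nodes at the right position in the tree: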

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom = new DOMDocument();
$dom->loadXML($xml);

// Select only the <description> nodes that sit inside an <item>
$xpath = new DOMXPath($dom);
$nodes = $xpath->query('channel/item/description');
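As a rough illustration of the difference, the sketch below counts the matches from both approaches on a hypothetical feed with the same shape as the one in the question. It uses the absolute path /rss/channel/item/description, which is my own choice for clarity. The relative expression in the answer depends on PHP evaluating a query against the root element when no context node is passed.

```php
<?php
// Hypothetical feed: one channel-level description and two item-level ones.
$xml = '<rss version="2.0"><channel>'
     . '<description>Feed description</description>'
     . '<item><description>First item</description></item>'
     . '<item><description>Second item</description></item>'
     . '</channel></rss>';

$dom = new DOMDocument();
$dom->loadXML($xml);

// Matches every description element, including the channel-level one.
$all = $dom->getElementsByTagName('description');

// The absolute path matches only the descriptions that belong to an item.
$xpath = new DOMXPath($dom);
$itemsOnly = $xpath->query('/rss/channel/item/description');

echo $all->length, "\n";       // 3
echo $itemsOnly->length, "\n"; // 2
```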

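Then iterate over the nodes, load each node's value into a new DOMDocument (there is no need to decode the HTML entities; DOM has already decoded them for you), and read the src attribute of the <img> node: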

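$srcs = array();
foreach ($nodes as $node) {
    $html = new DOMDocument();
    // nodeValue is already entity-decoded, so it is plain HTML at this point
    $html->loadHTML($node->nodeValue);
    $img = $html->getElementsByTagName('img')->item(0);
    if ($img !== null) { // skip descriptions that contain no image
        $srcs[] = $img->getAttribute('src');
    }
}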
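Putting the steps together, a self-contained sketch might look like the following. The function name extractImageSources, the sample feed, and the example.com URLs are assumptions for illustration and do not come from the original answer. It also collects every img in a description rather than only the first.

```php
<?php
// Hypothetical helper: the name and the return shape are assumptions.
function extractImageSources(string $xml): array
{
    // Collect parser warnings internally; the HTML fragments are not full documents.
    $previous = libxml_use_internal_errors(true);

    $sources = [];
    $dom = new DOMDocument();
    if ($dom->loadXML($xml)) {
        $xpath = new DOMXPath($dom);
        foreach ($xpath->query('/rss/channel/item/description') as $node) {
            $fragment = trim($node->nodeValue);
            if ($fragment === '') {
                continue; // nothing to parse in an empty description
            }
            $html = new DOMDocument();
            $html->loadHTML($fragment);
            foreach ($html->getElementsByTagName('img') as $img) {
                if ($img->hasAttribute('src')) {
                    $sources[] = $img->getAttribute('src');
                }
            }
        }
    }

    // Restore the previous libxml error handling state.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $sources;
}

// Sample input: one item with an escaped image link, one item without an image.
$xml = '<rss version="2.0"><channel>'
     . '<description>Feed description</description>'
     . '<item><description>&lt;a href=&#34;https://example.com/p/1&#34;&gt;'
     . '&lt;img src=&#34;https://example.com/a.jpg?oh=1&amp;oe=2&#34;&gt;'
     . '&lt;/a&gt;&lt;br&gt;Caption</description></item>'
     . '<item><description>No image here</description></item>'
     . '</channel></rss>';

$sources = extractImageSources($xml);
foreach ($sources as $src) {
    echo $src, "\n";
}
```

Returning an array instead of echoing inside the loop leaves the caller free to build the <img> tags, as in the question's foreach, or to process the URLs in some other way.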

eval.in demo

+0

Great! It seems to work... thank you! – ErickBest