2009-07-13 111 views
2

Parsing XML with PHP

I've been having trouble parsing XML with PHP, and I haven't really found the "right way", or at least a standard way, of parsing XML files.

First, I tried to parse this:

<item> 
    <title>2884400</title> 
    <description><![CDATA[ ><img width="126" alt="" src="http://userserve-ak.last.fm/serve/126/27319921.jpg" /> ]]></description> 
    <link>http://www.last.fm/music/+noredirect/Beatles/+images/27319921</link> 
    <author>anne710</author> 
    <pubDate>Tue, 21 Apr 2009 16:12:31 +0000</pubDate> 
    <guid>http://www.last.fm/music/+noredirect/Beatles/+images/27319921</guid> 
    <media:content url="http://userserve-ak.last.fm/serve/_/27319921/Beatles+2884400.jpg" fileSize="13065" type="image/jpeg" expression="full" width="126" height="126" /> 
    <media:thumbnail url="http://userserve-ak.last.fm/serve/126/27319921.jpg" type="image/jpeg" width="126" height="126" /> 
    </item> 

I'm using this code:

$doc = new DOMDocument(); 
$doc->load('http://ws.audioscrobbler.com/2.0/artist/beatles/images.rss'); 
$arrFeeds = array(); 
foreach ($doc->getElementsByTagName('item') as $node) { 
    $itemRSS = array ( 
     'title' => $node->getElementsByTagName('title')->item(0)->nodeValue, 
     'desc' => $node->getElementsByTagName('description')->item(0)->nodeValue, 
     'link' => $node->getElementsByTagName('link')->item(0)->nodeValue, 
     'date' => $node->getElementsByTagName('pubDate')->item(0)->nodeValue 
     ); 
    array_push($arrFeeds, $itemRSS); 
} 

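Now I want to get the url attributes of "media:content" and "media:thumbnail". How would I do that? I think I should use DOMElement::getAttribute, but I haven't managed to get it to work. Can anyone shed some light on this, and let me know whether this is a good way to parse XML?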

Regards, Shadi

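This whole question/thread is pretty bad. The problem is a lack of understanding of namespaces. I'd advise anyone reading this to learn about XML namespaces. People have mentioned this below. The issue is that media:content means the 'content' tag belonging to the 'media' namespace, not the default namespace (which is what you are querying). – Jotham 2010-02-11 02:04:18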
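
To make the namespace point concrete, here is a minimal sketch that runs against an inline sample instead of the live feed. The namespace URI `http://search.yahoo.com/mrss/`, the example.com URLs, and the variable names are assumptions for illustration; check the `xmlns:media` declaration on the feed's root element and use whatever URI appears there.

```php
<?php
// Assumption: the feed declares xmlns:media with this URI on its root element.
$mediaNs = 'http://search.yahoo.com/mrss/';

// Inline sample standing in for the real feed.
$xml = <<<XML
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>2884400</title>
      <media:content url="http://example.com/full.jpg" type="image/jpeg" />
      <media:thumbnail url="http://example.com/thumb.jpg" type="image/jpeg" />
    </item>
  </channel>
</rss>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

$urls = array();
foreach ($doc->getElementsByTagName('item') as $item) {
    // Look the elements up by namespace URI + local name, not by prefix.
    $content   = $item->getElementsByTagNameNS($mediaNs, 'content')->item(0);
    $thumbnail = $item->getElementsByTagNameNS($mediaNs, 'thumbnail')->item(0);

    // item(0) returns null when nothing matched, so guard before getAttribute().
    $urls[] = array(
        'content'   => $content   ? $content->getAttribute('url')   : '',
        'thumbnail' => $thumbnail ? $thumbnail->getAttribute('url') : '',
    );
}

print_r($urls);
```

The prefix (`media`) is only a local alias; what identifies the element is the URI it is bound to, which is why the lookup takes the URI rather than the string "media:content".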

Answers

1

This is how I eventually did it, using XMLReader:

<?php 

define ('XMLFILE', 'http://ws.audioscrobbler.com/2.0/artist/vasco%20rossi/images.rss'); 
echo "<pre>"; 

$items = array(); 
$i = 0; 

$xmlReader = new XMLReader(); 
$xmlReader->open(XMLFILE, null, LIBXML_NOBLANKS); 

$isParserActive = false; 
$simpleNodeTypes = array ("title", "description", "media:title", "link", "author", "pubDate", "guid"); 

while ($xmlReader->read()) 
{ 
    $nodeType = $xmlReader->nodeType; 

    // Only deal with Beginning/Ending Tags 
    if ($nodeType != XMLReader::ELEMENT && $nodeType != XMLReader::END_ELEMENT) { continue; } 
    else if ($xmlReader->name == "item") { 
     if (($nodeType == XMLReader::END_ELEMENT) && $isParserActive) { $i++; } 
     $isParserActive = ($nodeType != XMLReader::END_ELEMENT); 
    } 

    if (!$isParserActive || $nodeType == XMLReader::END_ELEMENT) { continue; } 

    $name = $xmlReader->name; 

    if (in_array ($name, $simpleNodeTypes)) { 
     // Skip to the text node 
     $xmlReader->read(); 
     $items[$i][$name] = $xmlReader->value; 
    } else if ($name == "media:thumbnail") { 
     $items[$i]['media:thumbnail'] = array (
       "url" => $xmlReader->getAttribute("url"), 
       "width" => $xmlReader->getAttribute("width"), 
       "height" => $xmlReader->getAttribute("height"), 
       "type" => $xmlReader->getAttribute("type") 
     ); 
    } else if ($name == "media:content") { 
     $items[$i]['media:content'] = array (
       "url" => $xmlReader->getAttribute("url"), 
       "width" => $xmlReader->getAttribute("width"), 
       "height" => $xmlReader->getAttribute("height"), 
       "filesize" => $xmlReader->getAttribute("fileSize"), 
       "expression" => $xmlReader->getAttribute("expression") 
     ); 
    } 
} 

print_r($items); 
echo "</pre>"; 

?> 
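
The code above matches on the prefixed name ("media:thumbnail"), which depends on the prefix the feed happens to use. A possible namespace-aware variant, as a rough sketch only: it reads an inline sample, and the namespace URI and example.com URL are assumptions rather than values taken from the real feed.

```php
<?php
// Assumption: this is the URI bound to the media prefix in the feed.
$mediaNs = 'http://search.yahoo.com/mrss/';

// Inline sample standing in for the real feed.
$xml = <<<XML
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>2884400</title>
      <media:thumbnail url="http://example.com/thumb.jpg" width="126" height="126" />
    </item>
  </channel>
</rss>
XML;

$reader = new XMLReader();
$reader->XML($xml);

$thumbnails = array();
while ($reader->read()) {
    // Match on namespace URI and local name instead of the "media:" prefix.
    if ($reader->nodeType === XMLReader::ELEMENT
        && $reader->namespaceURI === $mediaNs
        && $reader->localName === 'thumbnail') {
        $thumbnails[] = $reader->getAttribute('url');
    }
}
$reader->close();

print_r($thumbnails);
```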
0

You'll want something like this:

'content' => $node->getElementsByTagNameNS('http://search.yahoo.com/mrss/', 'content')->item(0)->getAttribute('url'), 
'thumbnail' => $node->getElementsByTagNameNS('http://search.yahoo.com/mrss/', 'thumbnail')->item(0)->getAttribute('url'), 

I believe that will work, but it's been a while since I've done anything like this.

So how do I implement that?! – 2009-07-13 20:57:06

Does this not work? – 2009-07-13 21:13:55

[Mon Jul 13 23:13:04 2009] [error] [client xxx.xxx.xxx.xxx] PHP Fatal error: Call to a member function getAttribute() on a non-object in /v2.php on line 73 – 2009-07-13 21:21:34

0
<?php 

# Convert the string into XML 
$xml = new SimpleXMLElement($_POST['name']); 

# Iterate through the XML for the data 

$values = "VALUES('' , "; 
foreach ($xml->item as $item) 
{ 
    // You now have access to that item 
} 

?> 
3

You can use SimpleXML, as other posters have suggested, but you need to use the children() and attributes() functions so that you can deal with the different namespaces.

Example (untested):

$feed = file_get_contents('http://ws.audioscrobbler.com/2.0/artist/beatles/images.rss'); 
$xml = new SimpleXMLElement($feed); 
foreach ($xml->channel->item as $item) { 
    foreach ($item->children('http://search.yahoo.com/mrss') as $media_element) { 
     var_dump($media_element); 
    } 
} 

Alternatively, you can use XPath (again, untested):

$feed = file_get_contents('http://ws.audioscrobbler.com/2.0/artist/beatles/images.rss'); 
$xml = new SimpleXMLElement($feed); 
// Register the media namespace URI (not the feed URL) under the prefix used in the query
$xml->registerXPathNamespace('media', 'http://search.yahoo.com/mrss'); 
$images = $xml->xpath('/rss/channel/item/media:content/@url'); 
var_dump($images); 
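
Since both snippets above are untested, here is a small self-contained sketch of the same two approaches run against an inline sample. The sample XML, the example.com URLs, and the namespace URI with a trailing slash are all assumptions; the URI has to match the feed's `xmlns:media` declaration exactly, character for character.

```php
<?php
// Assumption: must be identical to the URI in the feed's xmlns:media declaration.
$mediaNs = 'http://search.yahoo.com/mrss/';

// Inline sample standing in for the real feed; the second item has no media elements.
$feed = <<<XML
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>first</title>
      <media:content url="http://example.com/full.jpg" type="image/jpeg" />
    </item>
    <item>
      <title>second</title>
    </item>
  </channel>
</rss>
XML;

$xml = new SimpleXMLElement($feed);

// Approach 1: children() switches to the media namespace; the attributes
// themselves are not namespaced, so attributes() is called without arguments.
$viaChildren = array();
foreach ($xml->channel->item as $item) {
    $media = $item->children($mediaNs);
    if (isset($media->content)) {
        $viaChildren[] = (string) $media->content->attributes()->url;
    }
}

// Approach 2: XPath, after binding a prefix to the namespace URI.
$xml->registerXPathNamespace('media', $mediaNs);
$viaXPath = array();
foreach ($xml->xpath('/rss/channel/item/media:content/@url') as $attr) {
    $viaXPath[] = (string) $attr;
}

print_r($viaChildren);
print_r($viaXPath);
```

Both approaches skip the item that has no media:content element, so neither one trips over missing nodes.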
1

Try this. It should work fine.

$doc = new DOMDocument(); 
$doc->load('http://ws.audioscrobbler.com/2.0/artist/beatles/images.rss'); 
$arrFeeds = array(); 
foreach ($doc->getElementsByTagName('item') as $node) { 
    $itemRSS = array ( 
     'title' => $node->getElementsByTagName('title')->item(0)->nodeValue, 
     'desc' => $node->getElementsByTagName('description')->item(0)->nodeValue, 
     'link' => $node->getElementsByTagName('link')->item(0)->nodeValue, 
     'date' => $node->getElementsByTagName('pubDate')->item(0)->nodeValue, 
     'thumbnail' => $node->getElementsByTagName('thumbnail')->item(0)->getAttribute('url') 
    ); 
    array_push($arrFeeds, $itemRSS); 
} 
0

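If an entry such as thumbnail is missing from the feed, you may get the error "Call to a member function getAttribute() on a non-object". So, although I like @Helder Robalo's answer, you should check that the node exists before calling something like getAttribute() on it: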

<?php 

header('Content-type: text/plain; charset=utf-8'); 

$doc = new DOMDocument(); 
$doc->load('http://ws.audioscrobbler.com/2.0/artist/beatles/images.rss'); 
$arrFeeds = array(); 
foreach ($doc->getElementsByTagName('item') as $node) { 
    $itemRSS = array (
     'title' => $node->getElementsByTagName('title')->item(0)->nodeValue, 
     'desc' => $node->getElementsByTagName('description')->item(0)->nodeValue, 
     'link' => $node->getElementsByTagName('link')->item(0)->nodeValue, 
     'date' => $node->getElementsByTagName('pubDate')->item(0)->nodeValue 
    ); 

    if ($node->getElementsByTagName('thumbnail')->length > 0) 
    { 
     $itemRSS['thumbnail'] = $node->getElementsByTagName('thumbnail')->item(0)->getAttribute('url'); 
    } 
    else 
    { 
     $itemRSS['thumbnail'] = ''; 
    } 

    array_push($arrFeeds, $itemRSS); 
} 


print_r($arrFeeds); 
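
If you need the same check for several elements, it could be pulled into a small helper. This is only a sketch: `firstAttributeNS` is a hypothetical name, it swaps in a namespace-aware lookup (getElementsByTagNameNS) instead of getElementsByTagName, and the inline sample and namespace URI are assumptions standing in for the real feed.

```php
<?php
// Hypothetical helper: returns the attribute of the first descendant element
// with the given namespace URI and local name, or $default if there is none.
function firstAttributeNS(DOMElement $parent, $ns, $localName, $attribute, $default = '')
{
    $nodes = $parent->getElementsByTagNameNS($ns, $localName);
    if ($nodes->length === 0) {
        return $default; // element missing: avoid calling getAttribute() on null
    }
    return $nodes->item(0)->getAttribute($attribute);
}

// Assumption: the URI bound to the media prefix in the feed.
$mediaNs = 'http://search.yahoo.com/mrss/';

// Inline sample; the second item deliberately has no thumbnail.
$xml = <<<XML
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>with thumbnail</title>
      <media:thumbnail url="http://example.com/thumb.jpg" />
    </item>
    <item>
      <title>without thumbnail</title>
    </item>
  </channel>
</rss>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

$thumbs = array();
foreach ($doc->getElementsByTagName('item') as $node) {
    $thumbs[] = firstAttributeNS($node, $mediaNs, 'thumbnail', 'url');
}

print_r($thumbs);
```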
0

The media:content attributes are actually very easy to get with SimpleXML:

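$x = @simplexml_load_file($feed_url); 
if ($x === false) { 
    // The feed could not be loaded or parsed; handle the error here 
} 
else 
{ 
    foreach ($x->channel->item as $entry) 
    { 
        // Attributes of the first element in the media namespace (media:content in this feed) 
        $media = $entry->children('http://search.yahoo.com/mrss/')->attributes(); 
        $url = (string) $media['url']; 
    } 
}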