2017-05-30

PHP reading an RSS feed cannot read the <a10:content type="text/xml"> tag

I am trying to read an RSS feed with PHP. For some reason, it cannot read this content tag:

<a10:content type="text/xml">...</a10:content> 

This is what the feed might look like:

<rss version="2.0" xmlns:a10="http://www.w3.org/2005/Atom"> 
    <channel> 
     <title>mMin title</title> 
     <description>Some description</description> 
     <managingEditor>[email protected]</managingEditor> 
     <category>Some category</category> 
     <item> 
      <guid isPermaLink="false">1</guid> 
      <link>https://example.com/1</link> 
      <title>Some title 1</title> 
      <a10:updated>2017-05-30T13:20:22+02:00</a10:updated> 
      <a10:content type="text/xml"> 
       <Location>San diego</Location> 
       <PublishedOn>2016-10-21T11:21:07</PublishedOn> 
       <Body>Lorem ipsum dolar</Body> 
       <JobCountry>USA</JobCountry> 
      </a10:content> 
     </item> 
     <item> 
      <guid isPermaLink="false">1</guid> 
      <link>https://example.com/2</link> 
      <title>Some title 2</title> 
      <a10:updated>2017-05-30T13:20:22+02:00</a10:updated> 
      <a10:content type="text/xml"> 
       <Location>Detroit</Location> 
       <PublishedOn>2016-10-21T11:21:07</PublishedOn> 
       <Body>Lorem ipsum dolar</Body> 
       <JobCountry>USA</JobCountry> 
      </a10:content> 
     </item> 
     <item> 
      <guid isPermaLink="false">1</guid> 
      <link>https://example.com/3</link> 
      <title>Some title 3</title> 
      <a10:updated>2017-05-30T13:20:22+02:00</a10:updated> 
      <a10:content type="text/xml"> 
       <Location>Los Angeles</Location> 
       <PublishedOn>2016-10-21T11:21:07</PublishedOn> 
       <Body>Lorem ipsum dolar</Body> 
       <JobCountry>USA</JobCountry> 
      </a10:content> 
     </item> 
    </channel> 
</rss> 

Here is an example of my code:

$url = "http://example.com/RSSFeed";
$xml = simplexml_load_file($url);

foreach ($xml->channel as $x) {
    foreach ($x->item as $item) {
        dd($item);
    }
}

Output:

SimpleXMLElement {#111 ▼ 
     +"guid": "1" 
     +"link": "https://example.com" 
     +"title": "Some title" 
    } 

This is the output I expect:

SimpleXMLElement {#111 ▼ 
    +"guid": "1" 
    +"link": "https://example.com" 
    +"title": "Some title" 
    +"content" { 
    0 => { 
     +"Location": "San Diego" 
     +"PublishedOn": "2016-10-21T11:21:07" 
     +"Body": "Lorem ipsum dolar" 
     +"JobCountry": "USA" 
    } 
    1 => { 
     +"Location": "Detroit" 
     +"PublishedOn": "2016-10-21T11:21:07" 
     +"Body": "Lorem ipsum dolar" 
     +"JobCountry": "USA" 
    } 
    2 => { 
     +"Location": "Los Angeles" 
     +"PublishedOn": "2016-10-21T11:21:07" 
     +"Body": "Lorem ipsum dolar" 
     +"JobCountry": "USA" 
    } 
    } 
} 

Does anyone have a solution?

Can you post your complete XML?

@SahilGulati I updated the XML.

Answers

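You should access the element through its namespace. Here we use DOMDocument to get the desired output. DOMDocument has a method, getElementsByTagNameNS, to which we pass the namespace URI and the local name of the element we want. That produces the expected output.

If you prefer to use simplexml_load_string, you can check this PHP code demo.

Try this code snippet here: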

<?php 

ini_set('display_errors', 1); 

libxml_use_internal_errors(true); 
$string=<<<HTML 
<rss version="2.0" xmlns:a10="http://www.w3.org/2005/Atom"> 
    <channel> 
     <title>mMin title</title> 
     <description>Some description</description> 
     <managingEditor>[email protected]</managingEditor> 
     <category>Some category</category> 
     <item> 
      <guid isPermaLink="false">1</guid> 
      <link>https://example.com</link> 
      <title>Some title</title> 
      <a10:updated>2017-05-30T13:20:22+02:00</a10:updated> 
      <a10:content type="text/xml"> 
       <Location>Detroit</Location> 
       <PublishedOn>2016-10-21T11:21:07</PublishedOn> 
       <Body>Lorem ipsum dolar</Body> 
       <JobCountry>USA</JobCountry> 
      </a10:content> 
     </item> 
    </channel> 
</rss> 
HTML; 
$completeData = array();
$domDocument = new DOMDocument();
$domDocument->loadXML($string);
// Match <content> by namespace URI, so the prefix used in the feed does not matter.
$results = $domDocument->getElementsByTagNameNS("http://www.w3.org/2005/Atom", "content");
foreach ($results as $result)
{
    // Start a fresh array for every <a10:content> so values do not pile up across items.
    $data = array();
    foreach ($result->childNodes as $node)
    {
        // Skip whitespace text nodes and keep only child elements.
        if ($node instanceof DOMElement)
        {
            $data[] = $node->nodeValue;
        }
    }
    $completeData[] = $data;
}
print_r($completeData);
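
For readers who would rather stay with SimpleXML, the same namespace idea can be sketched with children(). This is only a minimal sketch, assuming a feed shaped like the one in the question; the variable names ($feed, $atomNs, $locations) and the shortened two-item sample are made up for illustration.

```php
<?php
// Shortened sample feed (an assumption for illustration), same shape as the question's feed.
$feed = <<<XML
<rss version="2.0" xmlns:a10="http://www.w3.org/2005/Atom">
    <channel>
        <item>
            <title>Some title 1</title>
            <a10:content type="text/xml">
                <Location>San diego</Location>
                <JobCountry>USA</JobCountry>
            </a10:content>
        </item>
        <item>
            <title>Some title 2</title>
            <a10:content type="text/xml">
                <Location>Detroit</Location>
                <JobCountry>USA</JobCountry>
            </a10:content>
        </item>
    </channel>
</rss>
XML;

$atomNs = 'http://www.w3.org/2005/Atom';
$xml = simplexml_load_string($feed);

$locations = [];
foreach ($xml->channel->item as $item) {
    // children($ns) selects the item's children in the Atom namespace,
    // which is where <a10:content> lives.
    $content = $item->children($atomNs)->content;
    // The elements inside <a10:content> have no namespace, so switch back
    // with an argument-less children() call before reading them.
    $fields = $content->children();
    $locations[] = (string) $fields->Location;
}

print_r($locations);
```

Plain property access such as $item->content only looks at children without a namespace, which is probably why the tag seemed to be missing from the dump in the question.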
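
Good answer, except that you didn't explain what his problem was. – delboy1978uk

@delboy1978uk Sure, I'll explain it.

@SahilGulati The problem here is that I need it as an array with several items, not key-value pairs.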

First of all, don't use SimpleXML, it's rubbish! You are better off using DOMDocument.

http://php.net/manual/en/class.domdocument.php

<?php 

// $xml is assumed to hold the feed as a string (for example, the XML from the question).
$dom = new DOMDocument();
$dom->loadXML($xml);


$items = $dom->getElementsByTagName('item'); 
$array = array(); 

foreach($items as $item) 
{ 
    $title = $item->getElementsByTagName('title')->item(0)->nodeValue; 
    $link = $item->getElementsByTagName('link')->item(0)->nodeValue; 
    $updated = $item->getElementsByTagName('updated')->item(0)->nodeValue; 
    $location = $item->getElementsByTagName('Location')->item(0)->nodeValue; 
    $pub = $item->getElementsByTagName('PublishedOn')->item(0)->nodeValue; 
    $body = $item->getElementsByTagName('Body')->item(0)->nodeValue; 
    $job = $item->getElementsByTagName('JobCountry')->item(0)->nodeValue; 

    $array[] = [ 
     'title' => $title, 
     'link' => $link, 
     'updated' => $updated, 
     'Location' => $location, 
     'PublishedOn' => $pub, 
     'Body' => $body, 
     'JobCountry' => $job, 
    ]; 
} 

var_dump($array); 

This will give you something like this for each item:

array(7) {
  ["title"]=>
  string(12) "Some title 1"
  ["link"]=>
  string(21) "https://example.com/1"
  ["updated"]=>
  string(25) "2017-05-30T13:20:22+02:00"
  ["Location"]=>
  string(9) "San diego"
  ["PublishedOn"]=>
  string(19) "2016-10-21T11:21:07"
  ["Body"]=>
  string(17) "Lorem ipsum dolar"
  ["JobCountry"]=>
  string(3) "USA"
}

See it here: https://3v4l.org/E0UXJ

Now that it works, let's tidy it up by creating a handy function:

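// Collects the text of the first element matching each name in $cols under $item.
function domToArray($item, array $cols)
{
    $array = [];
    foreach ($cols as $col) {
        $node = $item->getElementsByTagName($col)->item(0);
        // item(0) is null when the element is missing, so guard before reading nodeValue.
        $array[$col] = $node ? $node->nodeValue : null;
    }
    return $array;
}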

$dom = new DOMDocument(); 
$dom->loadXML($xml); 

$items = $dom->getElementsByTagName('item'); 
$array = array(); 

$fields = [ 
     'title', 
     'link', 
     'updated', 
     'Location', 
     'PublishedOn', 
     'Body', 
     'JobCountry', 
    ]; 

foreach($items as $item) 
{ 
    $array[] = domToArray($item, $fields); 
} 

var_dump($array); 

The output is the same; see it here: https://3v4l.org/W6HM3
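
If each item should keep its content fields grouped under one key instead of being flattened, the DOMDocument approach can be adapted roughly as follows. This is a sketch under assumptions: the nested 'content' key, the $atomNs variable, and the shortened two-item sample feed are illustrative choices, not part of the original answer.

```php
<?php
// Shortened sample feed (an assumption for illustration), same shape as the question's feed.
$xml = <<<XML
<rss version="2.0" xmlns:a10="http://www.w3.org/2005/Atom">
    <channel>
        <item>
            <link>https://example.com/1</link>
            <title>Some title 1</title>
            <a10:content type="text/xml">
                <Location>San diego</Location>
                <JobCountry>USA</JobCountry>
            </a10:content>
        </item>
        <item>
            <link>https://example.com/2</link>
            <title>Some title 2</title>
            <a10:content type="text/xml">
                <Location>Detroit</Location>
                <JobCountry>USA</JobCountry>
            </a10:content>
        </item>
    </channel>
</rss>
XML;

$atomNs = 'http://www.w3.org/2005/Atom';

$dom = new DOMDocument();
$dom->loadXML($xml);

$items = [];
foreach ($dom->getElementsByTagName('item') as $item) {
    $entry = [
        'title'   => $item->getElementsByTagName('title')->item(0)->nodeValue,
        'link'    => $item->getElementsByTagName('link')->item(0)->nodeValue,
        'content' => [],
    ];
    // Look up <a10:content> by namespace URI and local name, scoped to this item.
    foreach ($item->getElementsByTagNameNS($atomNs, 'content') as $content) {
        foreach ($content->childNodes as $child) {
            // Keep element children only; whitespace between tags is a text node.
            if ($child instanceof DOMElement) {
                $entry['content'][$child->nodeName] = $child->nodeValue;
            }
        }
    }
    $items[] = $entry;
}

print_r($items);
```

Each entry in $items is then one feed item, and casting an entry with (object) would turn it into a stdClass if objects are preferred over arrays.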

@delboy1987uk There are several items, and I need them as an array.

I want each item as an object, not everything as one flat array.

Updating! Hold on :-) – delboy1978uk

Here is my working solution, which I can share:

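// Fetch the raw feed so the namespace prefix can be stripped before parsing.
$xml = file_get_contents("https://example.com/RSSFeed");

// Rename <a10:content> to <content> so SimpleXML exposes it as an ordinary child.
$string = str_replace(array("<a10:content","</a10:content>"), array("<content","</content>"), $xml);

$sxe = new \SimpleXMLElement($string);

$jobs = array();

// Items sit under <channel>; iterating $sxe directly would only yield <channel>.
foreach ($sxe->channel->item as $item) {

    dd($item);

}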
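
As a possible alternative to rewriting the tag names with str_replace, the namespaced elements can also be reached through XPath. The following is a hedged sketch rather than a drop-in replacement: the 'atom' prefix is an arbitrary name bound only for the query, and the shortened sample feed stands in for the real one.

```php
<?php
// Shortened sample feed (an assumption for illustration), same shape as the question's feed.
$feed = <<<XML
<rss version="2.0" xmlns:a10="http://www.w3.org/2005/Atom">
    <channel>
        <item>
            <title>Some title 1</title>
            <a10:content type="text/xml">
                <Location>San diego</Location>
                <JobCountry>USA</JobCountry>
            </a10:content>
        </item>
        <item>
            <title>Some title 2</title>
            <a10:content type="text/xml">
                <Location>Detroit</Location>
                <JobCountry>USA</JobCountry>
            </a10:content>
        </item>
    </channel>
</rss>
XML;

$sxe = new SimpleXMLElement($feed);

// Bind a prefix to the Atom namespace URI for XPath queries.
// The prefix does not have to match the one used in the feed.
$sxe->registerXPathNamespace('atom', 'http://www.w3.org/2005/Atom');

$jobs = [];
foreach ($sxe->xpath('//item/atom:content') as $content) {
    $job = [];
    // The children of <a10:content> are not namespaced, so children() finds them.
    foreach ($content->children() as $name => $value) {
        $job[$name] = (string) $value;
    }
    $jobs[] = $job;
}

print_r($jobs);
```

This leaves the XML untouched, so it should keep working if the feed starts using a different prefix for the same namespace.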