2017-08-14 65 views

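How to parse HTML inside an XML tag

I need help getting data out of the description tag, which consists of an <a>, an <img>, and some text. The XML I am trying to parse is this.

I managed to get all the data I need except for the description tag, where I get the <a> tag along with the description text. What I need is the src of the img and the description text.

My code: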

$feed = array(); // assumed to be initialised before the loop; not shown in the original snippet

foreach ($rss->getElementsByTagName('item') as $node) {
    // Thumbnail: prefer the url attribute of a <content> element.
    $nodes = $node->getElementsByTagName('content');

    if ($nodes->length == 0) {
        // No <content>: fall back to an <image> element.
        $linkthumbNode = $node->getElementsByTagName('image');

        if ($linkthumbNode->length > 0) {
            $linkthumb = $linkthumbNode->item(0)->nodeValue;

            // Empty element text: try its src attribute instead.
            if (empty($linkthumb) || $linkthumb == " ") {
                $linkthumb = $linkthumbNode->item(0)->getAttribute('src');
            }
        } else {
            $linkthumb = "NO IMAGE";
        }
    } else {
        $linkthumb = $nodes->item(0)->getAttribute('url');
    }

    $title = $node->getElementsByTagName('title')->item(0)->nodeValue;
    // This is the problem line: it returns the <a>/<img> markup together with the text.
    $desc = $node->getElementsByTagName('description')->item(0)->textContent;
    $link = $node->getElementsByTagName('link')->item(0)->nodeValue;
    $img = $linkthumb;

    $date = $node->getElementsByTagName('pubDate');
    if ($date->length > 0) {
        $date = $date->item(0)->nodeValue;
    } else {
        $date = "no date provided";
    }

    $item = array(
        'title' => $title,
        'desc' => $desc,
        'link' => $link,
        'img' => $img,
        'date' => $date,
    );
    array_push($feed, $item);
}

The XML description tag is:

<description> 
<a href="http://timesofindia.indiatimes.com/life-style/health-fitness/diet/9-food-combos-to-make-you-lean/articleshow/20984744.cms"><img border="0" hspace="10" align="left" style="margin-top:3px;margin-right:5px;" src="http://timesofindia.indiatimes.com/photo/20984744.cms" /></a>Nine food combinations that will make staying healthy and looking fit easier 
</description> 

What I need: http://timesofindia.indiatimes.com/photo/20984744.cms as the image, and "Nine food combinations that will make staying healthy and looking fit easier" as my description.

Can someone help me? I am not very good at PHP or at parsing XML.

Answer


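Maybe I am a bit late to the party, but if you still need an answer, take a look at my solution. I use PHP's DOMDocument together with regular expressions, because I have not found a simple way to get the required data using only the XML extension.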

$rss = file_get_contents('https://timesofindia.indiatimes.com/rssfeeds/2886704.cms');
$feed = new DOMDocument();
$feed->loadXML($rss);

$items = array();

foreach ($feed->getElementsByTagName('item') as $item) {
    $arr = array();
    foreach ($item->childNodes as $child) {
        if ($child->nodeName === 'title' || $child->nodeName === 'link') {
            $arr[$child->nodeName] = $child->nodeValue;
        }
        if ($child->nodeName === 'pubDate') {
            $arr['date'] = $child->nodeValue;
        }
        if ($child->nodeName === 'description') {
            // The description holds HTML as plain text: take the value of the first src attribute.
            // [^'"]+ stops at the closing quote, so quotes later in the text are not swallowed.
            if (preg_match('/(?<=src=[\'"])[^\'"]+/i', $child->nodeValue, $matches)) {
                $arr['img'] = $matches[0];
            }
            // Everything after the last ">" is the description text.
            if (preg_match('/[^>]+$/', $child->nodeValue, $matches)) {
                $arr['desc'] = trim($matches[0]);
            }
        }
    }
    $items[] = $arr;
}
print_r($items);

The output looks like this, which seems to be what you need:

Array ([0] => Array ([title] => 5 reasons you get sore after sex [img] => https://timesofindia.indiatimes.com/photo/61101815.cms [desc] => Sometimes, a super-filmy, almost-perfect sex leaves you all euphoric but only to end with soreness later. So, what is it that is going wrong? Can it be remedied? [link] => https://timesofindia.indiatimes.com/life-style/health-fitness/health-news/5-reasons-you-get-sore-after-sex/life-style/health-fitness/health-news/5-reasons-you-get-sore-after-sex/photostory/61101724.cms [date] => Mon, 16 Oct 2017 10:21:27 GMT)...
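
If you would rather avoid regular expressions, one possible alternative is to hand the description string to a second DOMDocument via loadHTML and read the img element and the text from it. The following is only a minimal sketch under some assumptions: parseDescription is a hypothetical helper name, the sample string is the description from the question, and wrapping the fragment in a div with a UTF-8 hint is my own choice, not something the feed requires.

```php
<?php
// Sample <description> payload copied from the question. After the feed's
// XML has been parsed, the description arrives as a plain string of HTML.
$descriptionHtml = '<a href="http://timesofindia.indiatimes.com/life-style/health-fitness/diet/9-food-combos-to-make-you-lean/articleshow/20984744.cms">'
    . '<img border="0" hspace="10" align="left" style="margin-top:3px;margin-right:5px;" src="http://timesofindia.indiatimes.com/photo/20984744.cms" /></a>'
    . 'Nine food combinations that will make staying healthy and looking fit easier ';

/**
 * Hypothetical helper (not part of the original answer): returns the src of
 * the first <img> and the visible text of an HTML fragment.
 */
function parseDescription(string $html): array
{
    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    // Assumption: the text is UTF-8. The wrapper <div> gives the fragment
    // a single container element to read the text from.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $img = $doc->getElementsByTagName('img')->item(0);
    $wrapper = $doc->getElementsByTagName('div')->item(0);

    return array(
        'img'  => $img !== null ? $img->getAttribute('src') : null,
        'desc' => $wrapper !== null ? trim($wrapper->textContent) : '',
    );
}

$result = parseDescription($descriptionHtml);
print_r($result);
```

Inside the loop from the answer, the same helper could be called as parseDescription($child->nodeValue) in the description branch. For the sample above it yields the photo URL as img and the sentence without any markup as desc; a description without an image gives null for img.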