2016-08-22 89 views
0

How do I get the <img> src value from an XML file?

I want to create a simple RSS feed website.
I can read some RSS feeds by just doing this:

let article = { 
       'title': item.title, 
       'image': item.image.url, 
       'link': item.link, 
       'description': item.description, 
      } 

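The title and link work for most RSS feeds, but the image and description do not.
That is because many RSS feeds carry the image as HTML inside the description, like this: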

{ title: 'The Rio Olympics Are Where TV Finally Sees the Future', 
description: '<div class="rss_thumbnail"><img src="http://www.wired.com/wp-content/uploads/2016/08/GettyImages-587338962-660x435.jpg" alt="The Rio Olympics Are Where TV Finally Sees the Future" /></div>Time was, watching the Olympics just meant turning on your TV. That\'s changed—and there\'s no going back. The post <a href="http://www.wired.com/2016/08/rio-olympics-tv-finally-sees-future/">The Rio Olympics Are Where TV Finally Sees the Future</a> appeared first on <a href="http://www.wired.com">WIRED</a>.',... 

How can I get the image URL out of it?

Edit:

http.get("http://www.wired.com/feed/"... 

    .on('readable', function() { 
     let stream = this; 
     let item; 
     while(item = stream.read()){ 
      let article = { 
       'title': item.title, 
       'image': item.image.url, 
       'link': item.link, 
       'description': item.description, 
      } 
      news.push(article); 
     } 
    }) 

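This is part of my code; basically I am trying to get the image URL out of the WIRED RSS feed.
If I use 'image': item.image.url, it does not work. What should I change it to?

Answers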

1

Use xml2js to convert the XML to JSON:

var parseString = require('xml2js').parseString; 

var xml = '<img title=\'A San Bernardino County Fire Department firefighter watches a helitanker make a water drop on a wildfire, seen from Cajon Boulevard in Devore, Calif., Thursday, Aug. 18, 2016. (David Pardo/The Daily Press via AP)\' height=\'259\' alt=\'APTOPIX California Wildfires\' width=\'460\' src=\'http://i.cbc.ca/1.3730399.1471835992!/cpImage/httpImage/image.jpg_gen/derivatives/16x9_460/aptopix-california-wildfires.jpg\' />'; 

parseString(xml, function (err, result) { 
    console.log(JSON.stringify(result, null, 4)); 
    console.log(result["img"]["$"]["src"]); 
}); 
+0

I tried your answer, but it did not work. I edited the question and added some of my code. – Dan

+0

@Dan Sorry that the code failed for you. It should get the 'url' from the string. Can you tell me what you changed in it? –

-1

You can use the DOMDocument parser to get the image source.

// Single quotes need no escaping inside a double-quoted PHP string;
// a backslash before them would end up in the markup itself.
$html = "<img title='A San Bernardino County Fire Department firefighter watches a helitanker make a water drop on a wildfire, seen from Cajon Boulevard in Devore, Calif., Thursday, Aug. 18, 2016. (David Pardo/The Daily Press via AP)' height='259' alt='APTOPIX California Wildfires' width='460' src='http://i.cbc.ca/1.3730399.1471835992!/cpImage/httpImage/image.jpg_gen/derivatives/16x9_460/aptopix-california-wildfires.jpg' />"; 

$doc = new DOMDocument(); 
$doc->loadHTML($html); 
$xpath = new DOMXPath($doc); 
$src = $xpath->evaluate("string(//img/@src)"); // value of the first <img> src attribute 
+0

Where did the OP say they wanted PHP? –

0

A regular expression on the string:

var res = description.match(/src=.*\.(jpg|jpeg|png|gif)/gi); 

Fiddle Demo
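
The pattern above is greedy and returns the whole match, including the leading src= and the opening quote. A minimal sketch of a stricter variant follows, assuming the description looks like the WIRED sample in the question. The helper name firstImageSrc is hypothetical, and a regular expression is only an approximation here, not a full HTML parser.

```javascript
// Hypothetical helper (not from the original answers): returns the src of the
// first <img> tag in an HTML string, or null when there is none.
function firstImageSrc(html) {
  // [^>]*? keeps the search inside a single tag; the backreference \1 closes
  // the value with the same quote character that opened it.
  const match = /<img\b[^>]*?\ssrc\s*=\s*(["'])(.*?)\1/i.exec(html || '');
  return match ? match[2] : null;
}

// Shortened copy of the description from the question.
const description =
  '<div class="rss_thumbnail"><img src="http://www.wired.com/wp-content/uploads/2016/08/GettyImages-587338962-660x435.jpg" alt="The Rio Olympics Are Where TV Finally Sees the Future" /></div>Time was, watching the Olympics just meant turning on your TV.';

console.log(firstImageSrc(description));
```

The capture group returns only the URL, so nothing has to be trimmed off the result afterwards.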

+0

I tried your answer, but it did not work. I edited the question and added some of my code. – Dan

0

One idea is to use a regular expression. For example:

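// Group 1 captures the opening quote, group 2 the URL up to the matching quote.
var re = /src=(['"])(http.*?)\1/;
var img_string = "your image tag string";
var match = re.exec(img_string);
// exec returns null when nothing matches, so guard before indexing.
var result = match ? match[2] : null;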
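
Tying this back to the loop in the question, here is a rough sketch of building the article object with a fallback. The names toArticle and sampleItem are hypothetical, and the empty image object on the sample item is an assumption about what the parser delivers for this feed, not something the question confirms.

```javascript
// Hypothetical helper: builds the article object from a parsed feed item.
function toArticle(item) {
  const html = item.description || '';
  const inDescription = /<img\b[^>]*?\ssrc\s*=\s*(["'])(.*?)\1/i.exec(html);
  // Prefer the feed's own image field when it carries a URL; otherwise fall
  // back to the first <img> found inside the description HTML.
  const image =
    (item.image && item.image.url) || (inDescription ? inDescription[2] : null);
  return {
    title: item.title,
    image: image,
    link: item.link,
    // Crude tag stripping, intended only for a short preview text.
    description: html.replace(/<[^>]*>/g, '').trim(),
  };
}

// Sample item modelled on the WIRED entry in the question (description shortened).
const sampleItem = {
  title: 'The Rio Olympics Are Where TV Finally Sees the Future',
  link: 'http://www.wired.com/2016/08/rio-olympics-tv-finally-sees-future/',
  image: {}, // assumption: the parser found no dedicated image element
  description:
    '<div class="rss_thumbnail"><img src="http://www.wired.com/wp-content/uploads/2016/08/GettyImages-587338962-660x435.jpg" alt="The Rio Olympics Are Where TV Finally Sees the Future" /></div>Time was, watching the Olympics just meant turning on your TV.',
};

const article = toArticle(sampleItem);
console.log(article.image);
console.log(article.description);
```

Inside the 'readable' handler, news.push(toArticle(item)) would then replace the inline object literal.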