2013-11-26 49 views
2

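How to get all content from an RSS URL

I am developing an automated system in which a cron job fetches fresh content from RSS feeds and stores it in a database for later use (possibly as WordPress posts).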

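Everything works fine, but the only problem is that all I get is a very short description. I want to extract the full post content from the RSS feed, not just the excerpt.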

I'm using WordPress, not CodeIgniter.

PHP code

RSSLink = http://feeds.feedburner.com/learnhack

$rss = fetch_feed($entry->rss_link);
foreach ($rss->get_items() as $item) {
    var_dump($item);
    $page_content = array(
        'post_title'   => $item->get_title(),
        'post_content' => $item->get_description(),
    );
    // Database insert statements
}

Output:

SimplePie_Item Object ( [feed] => SimplePie Object ( [data] => Array ( [child] => Array ( [] => Array ( [rss] => Array ( [0] => Array ( ... [child] => Array ( [] => Array ( [channel] => Array ( [0] => Array ( ... [child] => Array ( [] => Array (
    [title] => Array ( [0] => Array ( [data] => Learn the Basics of Ethical Hacking [attribs] => Array ( ) [xml_base] => [xml_base_explicit] => [xml_lang] => ) )
    [link] => Array ( [0] => Array ( [data] => http://www.basicsofhacking.com/ ... ) )
    [description] => Array ( [0] => Array ( [data] => Learn ethical hacking techniques: what is hacking, email hacking, system hacking, website hacking, Facebook hacking, Google hacking and more. ... ) )
    [language] => Array ( [0] => Array ( [data] => en ... ) )
    [managingEditor] => Array ( [0] => Array ( [data] => [email protected] (Harwinder Kumar) ... ) )
    [lastBuildDate] => Array ( [0] => Array ( [data] => Sun, 24 Nov 2013 08:25:03 PST ... ) )
    [generator] => Array ( [0] => Array ( [data] => Blogger http://www.blogger.com ... ) )
    ... [link] => Array ( [0] => Array ( [data] => http://creativecommons.org/licenses/by/3.0/ ... ) ) ... [data] => http://creativecommons.org/images/public/somerights20.gif ... [title] => Array ( [0] => Array ( [data] => Some Rights Reserved ... ) )
    [item] => Array ( [0] => Array ( [data] => ... [child] => Array ( [] => Array (
        [title] => Array ( [0] => Array ( [data] => WordPress Security: Protecting Sites from Hackers/Future Attacks ... ) )
        [link] => Array ( [0] => Array ( [data] => http://feedproxy.google.com/~r/learnhack/~3/nSMFsPWxWQQ/wordpress-security-securing-sites-from-hackers.html ... ) )
        [category] => Array ( [0] => Array ( [data] => BASIC ETHICAL HACKING ... ) [1] => Array ( [data] => WORDPRESS TRICKS ... ) )
        [author] => Array ( [0] => Array ( [data] => [email protected] (Harwinder Kumar) ... ) )
        [guid] => Array ( [0] => Array ( [data] => tag:blogger.com,1999:blog-8198217290464183069... [attribs] => Array ( [] => Array ( [isPermaLink] => false ) ) ... ) )
        [description] => Array ( [0] => Array ( [data] => As WordPress is the most popular CMS on the web, it is also vulnerable if we do not follow the necessary security measures. In a previous guest post, Sarah Rexman mentioned some points about securing WordPress, and in this post I will share my own experience. Working as a freelancer on oDesk, Elance and Freelancer, clients always have questions about protecting their sites from hackers and ask how to prevent... [attribs] => Array ( ) [xml_base] => [xml_base_explicit] => [xml_lang] => ) ) )
        [http://search.yahoo.com/mrss/] => Array ( [thumbnail] => Array ( [0] => Array ( [data] => [attribs] => Array ( [] => Array ( [url] => http://2.bp.blogspot.com/-meTNpj8B758/UOR5j1OmE5I/AAAAAAAAAvk/UtCMCLa_C3Q/s72-c/WordPress+Security.jpg [height] => 72 [width] => 72 ) )

Also, I want to store the images from the RSS posts on my own server, without hotlinking.

+0

How about using var_dump($item) to see what data is in there? Based on that you can pull out whatever you want. –

+0

Yes, I tried that; the only content I get is from get_description – vs7

+0

Need more details on what you get from the RSS feed –

Answers

1

The description in an RSS feed is simply whatever the author decided to put there. It may be the full article, but it may also be just a summary.
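Some feeds carry the full post in a separate `content:encoded` element next to `description`, so it is worth checking for that first. The feed dumped in the question appears to contain only `description`, so this may not help there. Below is a minimal sketch using plain SimpleXML; the helper name `item_full_content` and the sample feed are invented for illustration. In WordPress, SimplePie's `$item->get_content()` does a similar job: it prefers full content and falls back to the summary.

```php
<?php
// Hypothetical helper: return the fullest body an RSS <item> offers.
function item_full_content(SimpleXMLElement $item): string
{
    // content:encoded lives in the RSS "content" module namespace.
    $content = $item->children('http://purl.org/rss/1.0/modules/content/');
    if (isset($content->encoded) && trim((string) $content->encoded) !== '') {
        return (string) $content->encoded;
    }
    // Fall back to whatever the author put in <description>.
    return (string) $item->description;
}

// Sample feed invented for illustration: one item with full content, one without.
$xml = <<<XML
<?xml version="1.0"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <item>
      <title>Full</title>
      <description>Short excerpt...</description>
      <content:encoded><![CDATA[<p>Full post body.</p>]]></content:encoded>
    </item>
    <item>
      <title>Summary only</title>
      <description>Only a summary...</description>
    </item>
  </channel>
</rss>
XML;

// LIBXML_NOCDATA merges CDATA sections into ordinary text nodes.
$feed = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOCDATA);

$bodies = [];
foreach ($feed->channel->item as $item) {
    $bodies[] = item_full_content($item);
}
```

If every item falls back to the description, the feed itself is truncated and the full text has to come from somewhere else.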

If you need the full article, I think you're pretty much stuck with fetching whatever is at the URL in the link element.
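Once the page behind the link has been downloaded, the post body still has to be cut out of the surrounding theme markup. A rough sketch with PHP's DOM extension follows; the helper name `extract_article_html` is hypothetical, and the `post-body` class is only an assumption about how the source blog marks up its posts, so check the real page first.

```php
<?php
// Hypothetical helper: inner HTML of the first node matching an XPath query,
// or null when nothing matches.
function extract_article_html(string $html, string $query): ?string
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML; collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    // The XML declaration prefix hints UTF-8 to the HTML parser.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    // Serialize the children so the wrapper element itself is left out.
    $inner = '';
    foreach ($nodes->item(0)->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
    return trim($inner);
}

// Sample page invented for illustration.
$page = '<html><body><div class="sidebar">Ads</div>'
      . '<div class="post-body"><p>Full article text.</p></div></body></html>';

// Assumed selector: a div whose class list contains "post-body".
$query = '//div[contains(concat(" ", normalize-space(@class), " "), " post-body ")]';
$body = extract_article_html($page, $query);
```

In WordPress the page HTML could be obtained with `wp_remote_get()` and `wp_remote_retrieve_body()`, passing `$item->get_permalink()` as the URL.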

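For images and other media files, you will probably need to parse the HTML and then download each one manually. Not to mention rewriting all the paths... Good luck...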
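The parsing and path rewriting can be sketched roughly as below. The helper name `rewrite_image_sources`, the wrapper id, and the `/wp-content/uploads/rss/` target path are all assumptions for illustration. The function only rewrites `src` attributes and returns the list of files that still need downloading; it does not download anything itself.

```php
<?php
// Hypothetical helper: point every <img src> at a local URL and report
// which remote files have to be downloaded (remote URL => local URL).
function rewrite_image_sources(string $html, callable $localUrlFor): array
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a known container so it can be serialized back.
    $doc->loadHTML(
        '<?xml encoding="UTF-8"><div id="rss-import-wrap">' . $html . '</div>'
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $downloads = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $remote = $img->getAttribute('src');
        if ($remote === '') {
            continue; // nothing to rewrite
        }
        $local = $localUrlFor($remote);
        $downloads[$remote] = $local;
        $img->setAttribute('src', $local);
    }

    $xpath = new DOMXPath($doc);
    $wrap = $xpath->query('//div[@id="rss-import-wrap"]')->item(0);
    $out = '';
    foreach ($wrap->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return [$out, $downloads];
}

// Assumed mapping: keep the file name, store under a local uploads folder.
$localUrlFor = fn(string $url): string =>
    '/wp-content/uploads/rss/' . basename((string) parse_url($url, PHP_URL_PATH));

[$out, $downloads] = rewrite_image_sources(
    '<p><img src="http://example.com/images/photo.jpg" alt="x"></p>',
    $localUrlFor
);
```

Each key in `$downloads` would then be fetched and saved to the matching local path, for example with WordPress's `download_url()` or `media_sideload_image()`. Two images with the same file name would collide under this naive mapping.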


Also, if indiscriminately scraping and plagiarizing other people's blogs for your own is what you are doing, then please just stop...

+0

I have 5-6 blogs of my own, so I'm developing this plugin for myself... Here is a site that gets the full content through an RSS feed: http://fulltextrssfeed.com/ – vs7

+0

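If they are actually your blogs, why not connect directly to the database and pull the content from there? Or even better, just use the WordPress import/export tools to merge your content into one blog and discard the others? Having 5-6 blogs when you could have one blog with 5-6 categories seems like a bit of a hassle... especially if you want all the content on one blog... – Svish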

+0

I need them to sync automatically... and I use a different server for each blog – vs7