2016-07-24 38 views

Separating the link/s in a Yahoo! RSS feed

I am using the feedparser module to create a news feed in my program.

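The link element in the Yahoo! Finance API actually contains two links: the Yahoo link and the actual article link (the external site/source). The two are separated by an asterisk, as in the following example: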

http://us.rd.yahoo.com/finance/external/investors/rss/SIG=12shc077a/*http://www.investors.com/news/technology/click/pokemon-go-hurting-facebook-snapchat-usage/

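Note the asterisk between the two items.

I would like to know whether there is a Pythonic way to separate the two and write only the second link to a file.

Thank you for your time.

Here is my relevant code: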

def parse_feed(news_feed_message, rss_url): 
    ''' This function parses the Yahoo! RSS API for data of the latest five articles, and writes it to the company news text file''' 

    # Define the RSS feed to parse from, as the url passed in of the company the user chose 
    feed = feedparser.parse(rss_url) 

    # Define the file to write the news data to the company news text file 
    outFile = open('C:\\Users\\nicks_000\\PycharmProjects\\untitled\\SAT\\GUI\\Text Files\\companyNews.txt', mode='w') 

    # Create a list to store the news data parsed from the Yahoo! RSS 
    news_data_write = [] 
    # Initialise a count 
    count = 0 
    # For the number of articles to append to the file, append the article's title, published date, and link to the news_data_write list 
    for count in range(10): 
        news_data_write.append(feed['entries'][count].title) 
        news_data_write.append(feed['entries'][count].published) 
        news_data_write.append(feed['entries'][count].link) 
        # Add one to the count, so that the next article is parsed 
        count+=1 
        # For each item in the news_data_write list, convert it to a string and write it to the company news text file 
        for item in news_data_write: 
            item = str(item) 
            outFile.write(item+'\n') 
        # For each article, write a new line to the company news text file, so that each article's data is on its own line 
        outFile.write('\n') 
        # Clear the news_data_write list so that data is not written to the file more than once 
        del(news_data_write[:]) 
    outFile.close() 

    read_news_file(news_feed_message) 

Answer

You can split the link as follows:

link = 'http://us.rd.yahoo.com/finance/external/investors/rss/SIG=12shc077a/*http://www.investors.com/news/technology/click/pokemon-go-hurting-facebook-snapchat-usage/' 

rss_link, article_link = link.split('*') 

Keep in mind that this requires the link to always contain an asterisk; otherwise you will get the following exception:

ValueError: not enough values to unpack (expected 2, got 1) 

If you only need the second link, you can also write:

_, article_link = link.split('*') 

The underscore signals that you want to discard the first return value. Another option is:

article_link = link.split('*')[1] 
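The variants above all fail when the link has no asterisk: unpacking raises ValueError and indexing with [1] raises IndexError. A minimal sketch of a more forgiving helper, assuming that a link without an asterisk should simply be returned unchanged (the function name extract_article_link is made up for illustration):

```python
def extract_article_link(link):
    """Return the part of link after the first '*'.

    Assumption: if there is no asterisk, the whole link is already
    the article link, so it is returned as-is.
    """
    # maxsplit=1 splits only at the first asterisk; [-1] picks the last
    # piece, which is the whole string when no asterisk is present.
    # strip() removes any whitespace left around the separator.
    return link.split('*', 1)[-1].strip()


yahoo_link = ('http://us.rd.yahoo.com/finance/external/investors/rss/'
              'SIG=12shc077a/*http://www.investors.com/news/technology/'
              'click/pokemon-go-hurting-facebook-snapchat-usage/')
print(extract_article_link(yahoo_link))
```

Splitting at the first asterisk only means that an asterisk inside the article URL itself would be kept rather than treated as another separator.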

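About your code: if an exception is raised anywhere after you open the output file, the file will not be closed properly. You can use open as a context manager (a with statement) or a try ... finally block to make sure the file is closed no matter what happens.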

Context manager:

with open('youroutputfile', 'w') as f: 
    # your code 
    f.write(…) 

try ... finally block:

# Open the file before the try block, so that f is always defined when finally runs 
f = open('youroutputfile', 'w') 
try: 
    f.write(…) 
finally: 
    f.close()
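Putting both suggestions together, the writing loop from the question could look roughly like the sketch below. It is an illustration under assumptions, not a drop-in replacement: write_entries and article_link are hypothetical names, plain dicts stand in for the feed entries, and the sample title and date are invented.

```python
import os
import tempfile


def article_link(link):
    # Everything after the first '*', or the whole link if there is none
    return link.split('*', 1)[-1].strip()


def write_entries(entries, path, limit=5):
    """Write title, published date and article link of up to limit entries."""
    # The with block closes the file even if a write raises an exception
    with open(path, 'w', encoding='utf-8') as out_file:
        # Slicing never raises IndexError when the feed has fewer entries
        for entry in entries[:limit]:
            out_file.write(str(entry['title']) + '\n')
            out_file.write(str(entry['published']) + '\n')
            # A blank line after the link separates the articles
            out_file.write(article_link(entry['link']) + '\n\n')


# Invented sample data; only the link comes from the question
sample_entries = [{
    'title': 'Example article',
    'published': 'Sun, 24 Jul 2016',
    'link': ('http://us.rd.yahoo.com/finance/external/investors/rss/'
             'SIG=12shc077a/*http://www.investors.com/news/technology/'
             'click/pokemon-go-hurting-facebook-snapchat-usage/'),
}]

sample_path = os.path.join(tempfile.mkdtemp(), 'companyNews.txt')
write_entries(sample_entries, sample_path)
```

Slicing with entries[:limit] instead of indexing over a fixed range also avoids an IndexError when the feed returns fewer articles than expected.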