0
對於返回關鍵字列表的每篇文章。我們希望使用鍵 - >值將所有單詞連接到列表中,如下所示。在我追加之前,我想從列表中刪除'u'。然後我們想比較兩個列表中的多少個常用單詞並返回結果。加入列表中的關鍵字NYT
實施例列出了從dic['keywords']
返回:
第一個返回:
[
{
u'value': u'Dunford, Joseph F Jr',
u'name': u'persons',
u'rank': u'1'
},
{
u'value': u'Afghanistan',
u'name': u'glocations',
u'rank': u'1'
},
{
u'value': u'Afghan National Police',
u'name': u'organizations',
u'rank': u'1'
},
{
u'value': u'Afghanistan War (2001-)',
u'name': u'subject',
u'rank': u'1'
},
{
u'value': u'Defense and Military Forces',
u'name': u'subject',
u'rank': u'2'
}
]
第2返回:
[
{
u'value': u'Gall, Carlotta',
u'name': u'persons',
u'rank': u'1'
},
{
u'value': u'Gannon, Kathy',
u'name': u'persons',
u'rank': u'2'
},
{
u'value': u'Niedringhaus, Anja (1965-2014)',
u'name': u'persons',
u'rank': u'3'
},
{
u'value': u'Kabul (Afghanistan)',
u'name': u'glocations',
u'rank': u'2'
},
{
u'value': u'Afghanistan',
u'name': u'glocations',
u'rank': u'1'
},
{
u'value': u'Afghan National Police',
u'name': u'organizations',
u'rank': u'1'
},
{
u'value': u'Afghanistan War (2001-)',
u'name': u'subject',
u'rank': u'1'
}
]
所需的輸出:
List1 = ['Dunford, Joseph F Jr',’ Afghanistan’, ‘Afghan National Police’, ‘: Afghanistan War (2001-)’, ‘Defense and Military Forces’]
List2 = [‘Gall, Carlotta'’,’ u'Gannon, Kathy',’ Niedringhaus, Anja (1965-2014)’,’Afghanistan’]
關鍵詞共同點:2
我的代碼如下:
from flask import Flask, render_template, request, session, g, redirect, url_for
from nytimesarticle import articleAPI
api = articleAPI('X')
articles = api.search(q = 'Afghan War',
fq = {'headline':'', 'source':['Reuters','AP', 'The New York Times']},
begin_date = 20111231)
def parse_articles(articles):
'''
This function takes in a response to the NYT api and parses
the articles into a list of dictionaries
'''
news = []
for i in articles['response']['docs']:
dic = {}
dic['id'] = i['_id']
if i['abstract'] is not None:
dic['abstract'] = i['abstract'].encode("utf8")
dic['headline'] = i['headline']['main'].encode("utf8")
dic['desk'] = i['news_desk']
dic['date'] = i['pub_date'][0:10] # cutting time of day.
dic['section'] = i['section_name']
dic['keywords'] = i['keywords']
print dic['keywords']
if i['snippet'] is not None:
dic['snippet'] = i['snippet'].encode("utf8")
dic['source'] = i['source']
dic['type'] = i['type_of_material']
dic['url'] = i['web_url']
dic['word_count'] = i['word_count']
# locations
locations = []
for x in range(0,len(i['keywords'])):
if 'glocations' in i['keywords'][x]['name']:
locations.append(i['keywords'][x]['value'])
dic['locations'] = locations
# subject
subjects = []
for x in range(0,len(i['keywords'])):
if 'subject' in i['keywords'][x]['name']:
subjects.append(i['keywords'][x]['value'])
dic['subjects'] = subjects
news.append(dic)
return(news)
print(parse_articles(articles))