Clearing a list on every loop iteration in Python?

-1

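I am trying to loop over several articles on Reddit, go through each article, extract its most relevant entity (by picking the one with the highest relevance score), and append that entity to the list master_locations: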

from __future__ import print_function 
from alchemyapi import AlchemyAPI 
import json 
import urllib2 
from bs4 import BeautifulSoup 

alchemyapi = AlchemyAPI() 
reddit_url = 'http://www.reddit.com/r/worldnews' 
urls = [] 
locations = [] 
relevance = [] 
master_locations = [] 

def get_all_links(page):
    html = urllib2.urlopen(page).read()
    soup = BeautifulSoup(html)
    for a in soup.find_all('a', 'title may-blank ', href=True):
        urls.append(a['href'])
        run_alchemy_entity_per_link(a['href'])

def run_alchemy_entity_per_link(articleurl):
    response = alchemyapi.entities('url', articleurl)
    if response['status'] == 'OK':
        for entity in response['entities']:
            if entity['type'] in entity == 'Country' or entity['type'] == 'Region' or entity['type'] == 'City' or entity['type'] == 'StateOrCountry' or entity['type'] == 'Continent':
                if entity.get('disambiguated'):
                    locations.append(entity['disambiguated']['name'])
                    relevance.append(entity['relevance'])
                else:
                    locations.append(entity['text'])
                    relevance.append(entity['relevance'])
            else:
                locations.append('No Location')
                relevance.append('0')
        max_pos = relevance.index(max(relevance)) # get nth position of the highest relevancy score
        master_locations.append(locations[max_pos]) #Use n to get nth position of location and store that location name to master_locations
        del locations[0] # RESET LIST
        del relevance[0] # RESET LIST
    else:
        print('Error in entity extraction call: ', response['statusInfo'])

get_all_links('http://www.reddit.com/r/worldnews') # Gets all URLs per article, then analyzes entity 

for item in master_locations: 
    print(item)

But for some reason the lists locations and relevance do not seem to be reset. Am I doing something wrong?

The printed result is:

Holland 
Holland 
Beirut 
Beirut 
Beirut 
Beirut 
Beirut 
Beirut 
Beirut 
Beirut 
Beirut 
Beirut 
Beirut 
Beirut 
Mogadishu 
Mogadishu 
Mogadishu 
Mogadishu 
Mogadishu 
Mogadishu 
Mogadishu 
Mogadishu 
Johor Bahru

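(presumably because the lists are not being cleared)

Source

2014-09-06 Phillipe Dongwoo Han

I have downvoted because this is a long piece of code, most of it irrelevant, that could have been simplified a lot. http://sscce.org/ – Davidmh 2014-09-06 10:05:46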

del list[0] deletes only the first item in the list.

To delete all of the items, use one of the following:

del list[:]

or

list[:] = []
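To make the difference concrete, here is a minimal sketch comparing the three statements. The list contents and the name alias are made up for illustration only; they are not taken from the question's real data.

```python
# Hypothetical sample data, for illustration only.
locations = ['Holland', 'Beirut', 'Mogadishu']
alias = locations                    # a second name for the same list object

del locations[0]                     # removes only the first element
after_del_first = list(locations)    # snapshot: ['Beirut', 'Mogadishu']

del locations[:]                     # removes every element, in place
after_del_all = list(locations)      # snapshot: []

locations.append('Johor Bahru')
locations[:] = []                    # slice assignment also empties in place

# Both names still refer to the same, now empty, list object.
same_object = alias is locations
```

Both forms empty the existing list object rather than creating a new one, which matters when the list is a module-level global shared between functions, as in the question. On Python 3.3 and later, locations.clear() does the same thing.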

Source

2014-09-06 08:34:32 falsetru

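I tried changing the lists to 'locations[:] = []' and 'relevance[:] = []', but I get a 'ValueError: max() arg is an empty sequence' error. – 2014-09-06 09:33:25

@PhillipeDongwooHan, guard the two lines before the 'del' statements with 'if relevance:'. – falsetru 2014-09-06 09:35:07

Thanks! That fixed it! But could you briefly explain why this works? Why put an if condition there? – 2014-09-06 09:51:19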
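A likely explanation, sketched under assumptions: once the lists really are emptied after each article, an article for which the API returns no entities leaves relevance empty, and max() raises ValueError on an empty sequence. An empty list is falsy, so 'if relevance:' skips the lookup in that case. The helper name top_location and the sample values below are hypothetical.

```python
def top_location(locations, relevance):
    """Return the location with the highest relevance, or None if there is none."""
    if not relevance:                  # an empty list is falsy, so skip max()
        return None
    max_pos = relevance.index(max(relevance))
    return locations[max_pos]

# max() on an empty sequence raises ValueError, which the guard avoids.
try:
    max([])
    raised = False
except ValueError:
    raised = True

empty_result = top_location([], [])
best = top_location(['Beirut', 'Holland'], [0.2, 0.9])
```

The same check could instead be written with a try/except around max(); the guard is simply the shorter option here.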

In your case, do not reuse the lists; just create new ones:

from __future__ import print_function 
from alchemyapi import AlchemyAPI 
import json 
import urllib2 
from bs4 import BeautifulSoup 

alchemyapi = AlchemyAPI() 
reddit_url = 'http://www.reddit.com/r/worldnews' 

def get_all_links(page):
    html = urllib2.urlopen(page).read()
    soup = BeautifulSoup(html)
    urls = []
    master_locations = []
    for a in soup.find_all('a', 'title may-blank ', href=True):
        urls.append(a['href'])
        master_locations.append(run_alchemy_entity_per_link(a['href']))
    return urls, master_locations

def run_alchemy_entity_per_link(articleurl):
    response = alchemyapi.entities('url', articleurl)
    if response['status'] != 'OK':
        print('Error in entity extraction call: ', response['statusInfo'])
        return
    # A fresh list on every call, so nothing carries over between articles.
    locations_with_relevance = []
    for entity in response['entities']:
        if entity['type'] in ('Country', 'Region', 'City', 'StateOrCountry', 'Continent'):
            if entity.get('disambiguated'):
                location = entity['disambiguated']['name']
            else:
                location = entity['text']
            # float() rather than int(): relevance scores are fractional.
            locations_with_relevance.append((float(entity['relevance']), location))
        else:
            locations_with_relevance.append((0.0, 'No Location'))
    # max() raises ValueError on an empty list, so handle "no entities" first.
    if not locations_with_relevance:
        return 'No Location'
    return max(locations_with_relevance)[1]

def main():
    _urls, master_locations = get_all_links(reddit_url) # Gets all URLs per article, then analyzes entity

    for item in master_locations:
        print(item)

if __name__ == '__main__':
    main()

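When you have several related values to store per item, put them into a tuple and keep the tuples in a single list, instead of maintaining two or more separate lists.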
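As a small illustration of that advice, the sketch below keeps each relevance score next to its location. The pairs are invented sample values, not real API output.

```python
# Hypothetical (relevance, location) pairs, for illustration only.
pairs = [(0.33, 'Holland'), (0.91, 'Beirut'), (0.0, 'No Location')]

# Tuples compare element by element, so max() picks the highest relevance
# (ties would be broken by the location name).
best_relevance, best_location = max(pairs)

# An explicit key gives the same result here and states the intent directly.
best_by_key = max(pairs, key=lambda pair: pair[0])
```

Because each score travels with its location, there is no index bookkeeping between two lists and nothing to get out of sync.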

Source

2014-09-06 08:52:16 Daniel

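Hmm.. I tried running your code and got 'TypeError: 'list' object is not callable'? – 2014-09-06 09:32:09

@PhillipeDongwooHan: Corrected. In any case, the point is more to read the code and spot the differences. – Daniel 2014-09-06 10:03:10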
