2017-02-15 90 views
3

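Python: count occurrences of the elements of list 1 in list 2

In the code below, I want to count how many times each word in word_list occurs in test. The code does the job, but it is probably not efficient. Is there a better way to do it?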

word_list = ["hello", "wonderful", "good", "flawless", "perfect"] 
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"] 

result = [0] * len(word_list) 
for i in range(len(word_list)):
    for w in test:
        if w == word_list[i]:
            result[i] += 1

print(result) 

Answers

6

Use collections.Counter to count all the words in test in one pass, then look up the count of each word in word_list in that Counter.

>>> word_list = ["hello", "wonderful", "good", "flawless", "perfect"] 
>>> test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"] 
>>> import collections
>>> counts = collections.Counter(test)
>>> [counts[w] for w in word_list] 
[1, 0, 3, 0, 0] 

Or, using a dict comprehension:

>>> {w: counts[w] for w in word_list} 
{'perfect': 0, 'flawless': 0, 'good': 3, 'wonderful': 0, 'hello': 1} 

Creating the Counter should be O(n), and each lookup is O(1) on average, giving O(n + m) overall for n words in test and m words in word_list.
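As a minimal sketch of the O(n + m) approach described above, the two steps can be wrapped in a small helper. The function name `count_words` is hypothetical and not part of the answer.

```python
from collections import Counter

def count_words(word_list, test):
    """Hypothetical helper: count of each word in word_list as found in test."""
    counts = Counter(test)  # one pass over test: O(n)
    # A Counter returns 0 for missing keys, so no KeyError handling is needed.
    return [counts[w] for w in word_list]  # m lookups, O(1) each on average

word_list = ["hello", "wonderful", "good", "flawless", "perfect"]
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]
print(count_words(word_list, test))  # [1, 0, 3, 0, 0]
```

Keeping the result as a list preserves the order of word_list, which matches the output of the original nested loop.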

+0

Wouldn't it be more efficient to filter first? Also, according to this page: https://wiki.python.org/moin/TimeComplexity, a lookup in a list is O(n), unless you convert 'word_list' to a set. –

+0

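@ZaccharieRamzi What is it today with "do the lookup in a set"? You are the second person to suggest this. Is my answer unclear? I am not doing any lookup in a list, only in a dictionary, which is just as fast as a lookup in a set. Also, what filtering? –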

+0

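Yes, you're right, I confused myself with what I had in mind. If you do this: 'words = set(word_list); new_test = [word for word in test if word in words]; counts = collections.Counter(new_test)' you might get faster results, depending on the case. –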
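The filter-first idea from this comment could look roughly like the sketch below. Whether it is actually faster depends on the data (for example, how many words in test are irrelevant) and is not measured here. The name `filtered` is introduced for illustration only.

```python
from collections import Counter

word_list = ["hello", "wonderful", "good", "flawless", "perfect"]
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]

words = set(word_list)  # set membership tests are O(1) on average
# Keep only the words we will ask about, so the Counter stays small.
filtered = (w for w in test if w in words)
counts = Counter(filtered)

result = [counts[w] for w in word_list]
print(result)  # [1, 0, 3, 0, 0]
```

The counts for the words in word_list are the same as without filtering. The only difference is that the Counter never stores words such as "abc" that are never looked up.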

3

You can do this in linear time using a dictionary.

word_list = ["hello", "wonderful", "good", "flawless", "perfect"] 
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"] 

result = [] 
word_map = {} 
for w in test:
    if w in word_map:
        word_map[w] += 1
    else:
        word_map[w] = 1

for w in word_list:
    result.append(word_map.get(w, 0))

print(result) 
+2

Nice "no library" solution, but even so, you can use 'get' with a default value to make the code a bit more compact, e.g. 'result.append(word_map.get(w, 0))' –
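One possible reading of this suggestion is to use `dict.get` with a default in the counting loop as well, which removes the if/else branch. This is a sketch under that assumption, not the answerer's code.

```python
word_list = ["hello", "wonderful", "good", "flawless", "perfect"]
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]

word_map = {}
for w in test:
    # get() returns 0 the first time a word is seen, so no branch is needed
    word_map[w] = word_map.get(w, 0) + 1

# Same default trick when reading the counts back in word_list order
result = [word_map.get(w, 0) for w in word_list]
print(result)  # [1, 0, 3, 0, 0]
```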

1

You can combine collections.Counter and operator.itemgetter:

from collections import Counter 
from operator import itemgetter 

cnts = Counter(test) 
word_cnts = dict(zip(word_list, itemgetter(*word_list)(cnts))) 

which gives:

>>> word_cnts 
{'flawless': 0, 'good': 3, 'hello': 1, 'perfect': 0, 'wonderful': 0} 

or, if you prefer it as a list:

>>> list(zip(word_list, itemgetter(*word_list)(cnts))) 
[('hello', 1), ('wonderful', 0), ('good', 3), ('flawless', 0), ('perfect', 0)] 
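One caveat worth keeping in mind with this approach: `itemgetter` called with exactly one key returns the bare value rather than a 1-tuple, so the `zip` pattern above assumes word_list has at least two entries. The sketch below illustrates this with a hypothetical one-word list named `single` and shows a comprehension that works for any length.

```python
from collections import Counter
from operator import itemgetter

test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]
cnts = Counter(test)

single = ["good"]  # hypothetical word list with only one entry

# With one key, itemgetter returns the value itself, not a tuple,
# so it cannot be passed to zip() the way the multi-key result can.
value = itemgetter(*single)(cnts)
print(value)  # 3

# A comprehension gives the same mapping for a word list of any length.
word_cnts = {w: cnts[w] for w in single}
print(word_cnts)  # {'good': 3}
```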
+0

An impressive display of functional programming, but I still prefer list or dict comprehensions. ;-) –

+0

@tobias_k The comprehensions were already "taken" by another answer. Otherwise I would have added them :-P – MSeifert

-1

You could try using a dictionary:

word_list = ["hello", "wonderful", "good", "flawless", "perfect"] 
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"] 

result = {} 
for word in word_list:
    result[word] = 0
for w in test:
    if w in result:  # dict.has_key() was removed in Python 3
        result[w] += 1
print(result) 

But you will end up with a different structure. If you don't want that, you can try this instead:

word_list = ["hello", "wonderful", "good", "flawless", "perfect"] 
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"] 

result = {} 
for w in test:
    if w in result:  # dict.has_key() was removed in Python 3
        result[w] += 1
    else:
        result[w] = 1
count = [0] * len(word_list)
for i in range(len(word_list)):
    if word_list[i] in result:
        count[i] = result[word_list[i]]
print(count) 