2016-12-06 119 views
0

我有一個需要大量單詞的循環,將每個單詞分解爲字母並將它們附加到一個大列表中。減少循環的運行時間並提高效率

然後我檢查出現最信,如果未出現在字符串中,我把它存儲在具有兩個空間的列表:

list[0] =字母出現的最

list[1] =多少次出現

這個循環是超級低效。它可以工作,但返回一個值需要大約25-30秒。在此之前,它會繼續下去,不會返回任何值。

如何提高我編寫的代碼的效率?

def choose_letter(words, pattern): 
    list_of_letters = [] 
    first_letter = [] # first spot is the letter, second is how many times it appears 
    second_letter =[] # first spot is letter, second how many times it appears 
    max_appearances = ["letter", 0] 
    for i in range(len(words)): # splits up every word into letters 
     list_of_letters.append(list(words[i])) 
    list_of_letters = sum(list_of_letters, []) # concatenates the lists within the list 
    first_letter = list_of_letters.count(0) 
    for j in list_of_letters: 
     second_letter = list_of_letters.count(j) 
     if second_letter >= max_appearances[1] and j not in pattern: 
      max_appearances[0] = j 
      max_appearances[1] = second_letter 
     else: 
      list_of_letters.remove(j) 
    return max_appearances[0] 
+2

這可能是http://codereview.stackexchange.com/ – Blorgbeard

+1

更好的候選人,當你移到codereview時,他們會要求查看你對該代碼運行的分析器的輸出。 –

+2

看起來像['collections.Counter']的工作(https://docs.python.org/3.5/library/collections.html#collections.Counter)。 – bereal

回答

0

使其更快的一種方法是選擇更好的數據結構。下面是使用collections.Counter一個例子:

from collections import Counter 

def choose_letter(words, pattern): 
    pattern = set(pattern) 
    letters = (letter 
       for word in words 
       for letter in word 
       if letter not in pattern) 
    letters = Counter(letters) 
    return letters.most_common(1)[0][0] 


mywords = 'a man a plan a canal panama'.split() 
vowels = 'aeiou' 
assert choose_letter(mywords, vowels) == 'n' 

下面是一個使用collections.defaultdict

from collections import defaultdict 

def choose_letter(words, pattern): 
    pattern = set(pattern) 
    counts = defaultdict(int) 
    for word in words: 
     for letter in word: 
      if letter not in pattern: 
       counts[letter] += 1 
    return max(counts, key=counts.get) 

mywords = 'a man a plan a canal panama'.split() 
vowels = 'aeiou' 
assert choose_letter(mywords, vowels) == 'n' 
0

你做了很多循環&,你不需要操作列表。每當您執行countnot in時,您都會強制程序循環查看列表/字符串以找到要查找的內容。從列表中刪除所有這些項目也非常昂貴。一個更優雅的解決方案是循環訪問一次單詞/字母列表,然後使用字典來計算每個字母的出現次數。從那裏,你有一個字符/字符對&你可以從那裏得到鍵/值,排序清單&看前兩個值。

from collections import defaultdict 
from itertools import chain 

def choose_letter(words, pattern=""): 
    count_dict = defaultdict(int) # all unknown values default to 0 
    for c in chain(*words): 
     count_dict[c] += 1 
    # you could replace this "not in" with something more efficient 
    filtered = [(char, count) for (char,count) in count_dict.items() if char not in pattern] 
    filtered.sort(lambda a,b: -cmp(a[0], b[0])) 
    print filtered 
    return filtered[0][0] 

如果你不希望深入論證拆包,itertools & defaultdicts,你可以只說:

count_dict = {} 
for word in words: 
    for char in word: 
     count_dict[char] = count_dict.get(char, 0) + 1 

...如果你不想嘗試挖掘到的參數拆包然而。