Finding the top 5 word lengths in a text

I'm trying to write a program that has two functions:

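count_word_lengths, which takes one argument, text (a string of text), and returns a defaultdict recording the count for each word length.
top5_lengths, which takes the same argument, text, and returns a list of the top 5 word lengths.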

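Note: when two lengths have the same frequency, they should be sorted in descending order. Also, if there are fewer than 5 distinct word lengths, it should return a shorter sorted list of word lengths.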

Example call to count_word_lengths:

count_word_lengths("one one was a racehorse two two was one too"): 
    defaultdict(<class 'int'>, {1: 1, 3: 8, 9: 1})

Example calls to top5_lengths:

top5_lengths("one one was a racehorse two two was one too") 
    [3, 9, 1] 

top5_lengths("feather feather feather chicken feather") 
    [7] 

top5_lengths("the swift green fox jumped over a cool cat") 
    [3, 5, 4, 6, 1]

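My current code looks like this. It seems to produce the right output for all of these calls, but it fails a hidden test. What kind of input am I not accounting for? Is my code behaving correctly? If not, how can I fix it?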

from collections import defaultdict 

length_tally = defaultdict(int) 
final_list = [] 

def count_word_lengths(text): 
    words = text.split(' ') 

    for word in words:
        length_tally[len(word)] += 1

    return length_tally 


def top5_word_lengths(text): 
    frequencies = count_word_lengths(text) 
    list_of_frequencies = frequencies.items() 
    flipped = [(t[1], t[0]) for t in list_of_frequencies] 
    sorted_flipped = sorted(flipped) 
    reversed_sorted_flipped = sorted_flipped[::-1] 


    for item in reversed_sorted_flipped:
        final_list.append(item[1])

    return final_list

Source

2016-04-21 Indifferent Potato

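I'm not a Python guy, but I can see a few things that might be causing problems.

You keep referring to top5_lengths, but your code has a function called top5_word_lengths.
You use a function named count_lengths, which isn't defined anywhere.

Fix these and see what happens!

Edit: This shouldn't affect your code, but it isn't great practice for your functions to update variables outside their scope. You may want to move the variable assignments at the top into the functions that use them.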
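To make the suggestion about moving the variables concrete, here is a minimal sketch rather than a definitive fix. It assumes the hidden test calls the functions more than once, or passes text with fewer or more than five distinct lengths. With length_tally and final_list defined at module level, each call keeps adding to the results of the previous one. The sketch also assumes the expected name is top5_lengths, and that sorting by count and then by length, both descending, is what the tie-breaking rule means.

```python
from collections import defaultdict


def count_word_lengths(text):
    # A fresh tally per call, so earlier calls cannot leak into this one.
    length_tally = defaultdict(int)
    # Assumption: runs of whitespace should not count as zero-length words,
    # so split() is used with no argument.
    for word in text.split():
        length_tally[len(word)] += 1
    return length_tally


def top5_lengths(text):
    tally = count_word_lengths(text)
    # Sort by count, then by length, both descending.
    ranked = sorted(tally.items(), key=lambda kv: (kv[1], kv[0]), reverse=True)
    # Keep at most five lengths; shorter inputs give a shorter list.
    return [length for length, _count in ranked[:5]]


print(top5_lengths("one one was a racehorse two two was one too"))
print(top5_lengths("feather feather feather chicken feather"))
print(top5_lengths("the swift green fox jumped over a cool cat"))
```

Because nothing is stored outside the functions, calling top5_lengths twice with the same text gives the same list both times.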

Source

2016-04-21 05:57:50 beane

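One thing to note is that you don't account for empty strings. This would cause count() to return null/undefined. You can also use iteritems() (items() in Python 3) in a list comprehension to get a dictionary's keys and values, like for k,v in dict.iteritems():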
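As a small illustration of the empty-string point, the sketch below compares split(' '), which the question's code uses, with split() called with no argument. The sample strings are made up for the demonstration. With an explicit separator, an empty text or repeated spaces produce empty strings, which the question's code would then tally as words of length 0.

```python
# Hypothetical sample inputs: empty text, a double space, padded text.
samples = ["", "a  b", " lead and trail "]

# split(" ") keeps empty strings between and around the separators.
with_separator = [s.split(" ") for s in samples]
# split() treats any run of whitespace as one separator and drops empties.
without_separator = [s.split() for s in samples]

for s, a, b in zip(samples, with_separator, without_separator):
    print(repr(s), a, b)
```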

Source

2016-04-21 05:59:49

Not really an answer, but here is another way that tracks the words themselves rather than just their lengths:

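from collections import defaultdict

def count_words_by_length(text):
    # Pair each word with its length, then group the words by length.
    words = [(len(word), word) for word in text.split(" ")]
    d = defaultdict(list)
    for k, v in words:
        d[k].append(v)
    return d


def num_top_words_by_length(words_by_length, how_many):
    # sorted() returns a list, so the slice works on Python 3 as well as Python 2.
    longest = sorted(words_by_length.items())[-how_many:]
    return [{"word_length": length, "num_words": len(words)} for length, words in longest]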

Used like this:

my_dict = count_words_by_length('hello sir this is a beautiful day right') 
my_top_words = num_top_words_by_length(my_dict, 5) 

print(my_top_words) 
print(my_dict)

Output:

[{'word_length': 9, 'num_words': 1}] 
defaultdict(<type 'list'>, {1: ['a'], 2: ['is'], 3: ['sir', 'day'], 4: ['this'], 5: ['hello', 'right'], 9: ['beautiful']})

Source

2016-04-21 06:13:07 Yerken
