Sorting a dictionary by maximum value: class method

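I want to sort a particular dictionary and return a list of the top n words by number of occurrences. The dictionary is a collection of words from a txt file, where each key is a single word from the file and each value is the number of times that word occurs in the document.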

My initializer looks like this:

def __init__(self:'Collection_of_words', file_name: str) -> None: 
    ''' this initializer will read in the words from the file, 
    and store them in self.counts''' 
    l_words = open(file_name).read().split() 
    s_words = set(l_words) 
    self.counts = dict([ [word, l_words.count(word)] 
         for word 
         in s_words])

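Now, one of my instance methods should take an int parameter and return a list of strings: the "top n" words by number of occurrences. I gave it a shot: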

def top_n_words(self, i): 
    '''takes one additional parameter, an int, 
    <i> which is the top number of occurences. Returns a list of the top <i> words.''' 


    return [ pair[0] 
      for pair 
      in sorted(associations, key=lambda pair: pair[1], reverse=True)[:5]]

However, whenever I run this code I get an error, and I can't figure out why. I'm not sure how to work with an object such as self.counts.

2014-02-09 Professor Scientist

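Do you just need this to work, or are you trying to figure out how to do it as a learning exercise? There is a built-in class, 'collections.Counter', that will do this for you more efficiently.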

No, I'm trying to figure this out as a learning exercise, without the help of collections.Counter.

sorted(self.counts, key=lambda pair: pair[1], reverse=True)

Iterating over self.counts gives you the keys, not key-value pairs, so pair[1] won't work as intended. You want key=self.counts.get.
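
A minimal sketch of that fix, assuming a small made-up dictionary named counts in place of self.counts (the sample words and the name top_two are only illustrative):

```python
# Hypothetical sample data standing in for self.counts.
counts = {"would,": 1, "Even": 1, "Cries": 2, "Sings": 5}

# Iterating over a dict yields its keys, so the key function receives a word.
# counts.get maps each word to its count, which is the value we sort by.
top_two = sorted(counts, key=counts.get, reverse=True)[:2]
print(top_two)  # ['Sings', 'Cries']
```

The result contains only the words, which matches what the original list comprehension was extracting with pair[0].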

If your list needs to include the counts as well as the keys, sort the key-value pairs by value instead:

sorted(self.counts.items(), key=operator.itemgetter(1), reverse=True)
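
As a self-contained illustration, the same call might be used like this; counts is a hypothetical stand-in for self.counts, and note that the operator module has to be imported first:

```python
import operator

# Hypothetical sample data standing in for self.counts.
counts = {"would,": 1, "Cries": 2, "Sings": 5}

# items() yields (word, count) tuples; itemgetter(1) selects the count,
# so the pairs are ordered by count, largest first.
pairs = sorted(counts.items(), key=operator.itemgetter(1), reverse=True)
print(pairs)  # [('Sings', 5), ('Cries', 2), ('would,', 1)]
```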

Also, note that collections.Counter already does what you need, and its counting algorithm runs in linear time rather than quadratic time.
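
For comparison, a rough sketch of the Counter approach, assuming the text has already been read into a string (the sample text and variable names here are made up for illustration):

```python
from collections import Counter

# Hypothetical text standing in for the contents of the file.
text = "Sings Sings Cries Sings Even Cries"

# Counter tallies the words in a single pass over the list,
# instead of calling list.count() once per distinct word.
counts = Counter(text.split())

# most_common(n) returns the n highest (word, count) pairs, largest first.
top_words = [word for word, _ in counts.most_common(2)]
print(top_words)  # ['Sings', 'Cries']
```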

2014-02-09 00:22:51 user2357112

You could also use 'sorted(self.counts.items(), key=operator.itemgetter(1), reverse=True)'. Normally I would lean toward that, but since you are only looking for the keys in the original call ('pair[0]'), using 'get' is probably cleaner.

I solved this by creating a variable 'associations' that holds the dict_items. For example:

associations = self.counts.items() 

>>> associations 
dict_items([('would,', 1), ('Even', 1), ('Cries', 1), ('Sings', 5)])

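I then use this variable in the list comprehension. I sort associations in descending order (largest to smallest) with a lambda function that indexes the second element of each pair. The word with the most occurrences ends up at index [0] of the list.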

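def top_n_words(self, i):
    # (word, count) pairs from the dictionary
    associations = self.counts.items()

    # sort by count, largest first, and keep the words of the first i pairs
    return [pair[0]
            for pair
            in sorted(associations, key=lambda pair: pair[1], reverse=True)[:i]]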

2014-02-09 01:19:48
