Words in a book (.txt file), sorted by frequency

from collections import Counter
wordlist = open('mybook.txt','r').read().split()
c = Counter(wordlist)
print(c)

# result : 
# Counter({'the': 9530, 'to': 5004, 'a': 4203, 'and': 4202, 'was': 4197, 'of': 3912, 'I': 2852, 'that': 2574, ... })

This prints all the words of a book, sorted by frequency.

How can I write this result to a .txt output file?

g = open('wordfreq.txt','w') 
g.write(c) # here it fails

Here is the desired output, wordfreq.txt:

the, 9530
to, 5004
a, 4203
and, 4202
was, 4197
...

Source

2015-11-02 Basj

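So you are trying to read all of this into memory at once? Have you done any research? I hope no one answers this question, so that you actually try something. SO is not a place to get someone to code for you for free. You need to try and research how to do something first. –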

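@JohnRuddell If you don't feel like answering, then *don't*. I tried all sorts of things, including 'json.dumps' on this 'dict', before realizing that it is not just a 'dict' but something more complicated... Again, if you don't like the question, then don't answer it, but I don't see the point of your comment. – Basj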

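@Basj A few things. First, I did not downvote your question, but others did, because it shows no attempt to solve the problem. If you have 'tried all sorts of things', then post them in your question and we can tell you where you went wrong... Most importantly, I can do a simple Google search and easily find a solution. A lack of research before asking a question will usually get you a lot of downvotes as well. –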

If you want to write it in sorted order, you can do this.

from collections import Counter

# Split the input file on whitespace and count each word.
with open('so.py', 'r') as read_file:
    word_counts = Counter(read_file.read().split())

# Write one "word, count" line per word, most frequent first.
with open('wordfreq.txt', 'w') as write_file:
    for w, c in sorted(word_counts.items(), key=lambda x: x[1], reverse=True):
        write_file.write('{w}, {c}\n'.format(w=w, c=c))

Source

2015-11-02 22:52:03

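I think this may be the help you need: how to print the dictionary in the format you requested. The first four lines are your original code.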

from collections import Counter
wordlist = open('so.py', 'r').read().split()
c = Counter(wordlist)
print(c)

# Write one "word,count" line per entry (not sorted by count).
with open('output.txt', 'w') as outfile:
    for word, count in c.items():
        outline = word + ',' + str(count) + '\n'
        outfile.write(outline)

Source

2015-11-02 22:23:46 Prune

Thanks! Just one small thing: how can I get it sorted by frequency ('count')? In the printed output of 'c', it is magically sorted! (I triple-checked) – Basj

The practical answer is that you search online for "sort python dictionary by value". How did you not find anything before posting the comment? – Prune
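
For reference, the ordering seen when printing a Counter comes from its repr, which lists entries by descending count, while iterating over items() follows insertion order. A minimal sketch of sorting by value, using a small made-up word list (an assumption for illustration, not the book from the question):

```python
from collections import Counter

# Hypothetical sample data standing in for the words of the book.
words = ['to', 'the', 'a', 'the', 'to', 'the']
c = Counter(words)

# items() follows insertion order (Python 3.7+), not frequency.
unsorted_pairs = list(c.items())

# Sort the (word, count) pairs by count, highest first.
by_value = sorted(c.items(), key=lambda pair: pair[1], reverse=True)

# most_common() gives the same ordering without an explicit sort.
by_most_common = c.most_common()

print(unsorted_pairs)
print(by_value)
print(by_most_common)
```

Either by_value or by_most_common can then be looped over to write one line per word.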

I think this can be done a bit more simply. I also used a context manager (with) to close the files automatically.

from collections import Counter

with open('mybook.txt', 'r') as mybook:
    wordcounts = Counter(mybook.read().split())

with open('wordfreq.txt', 'w') as write_file:
    # most_common() yields (word, count) pairs, highest count first.
    for item in wordcounts.most_common():
        print('{}, {}'.format(*item), file=write_file)

If the file is especially large, you can avoid reading it into memory all at once by using

wordcounts = Counter(x for line in mybook for x in line.split())
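
Putting the line-by-line counting together with the output step, a minimal sketch might look like the following. The function name write_word_frequencies and the temporary sample file are assumptions made for illustration; they are not part of the original answer.

```python
import os
import tempfile
from collections import Counter

def write_word_frequencies(in_path, out_path):
    """Count whitespace-separated words and write 'word, count' lines."""
    with open(in_path, 'r') as book:
        # The generator feeds words to Counter one line at a time,
        # so the full text is never held in memory at once.
        counts = Counter(word for line in book for word in line.split())
    with open(out_path, 'w') as out:
        # most_common() yields (word, count) pairs, highest count first.
        for word, count in counts.most_common():
            out.write('{}, {}\n'.format(word, count))
    return counts

# Usage with a tiny temporary file standing in for mybook.txt.
tmp_dir = tempfile.mkdtemp()
in_path = os.path.join(tmp_dir, 'mybook.txt')
out_path = os.path.join(tmp_dir, 'wordfreq.txt')
with open(in_path, 'w') as f:
    f.write('the cat saw the dog\nthe dog ran\n')

write_word_frequencies(in_path, out_path)
with open(out_path) as f:
    result = f.read()
print(result)
```

The Counter itself still holds one entry per distinct word, so memory use grows with the vocabulary rather than with the length of the file.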

Source

2017-05-02 01:34:10
