2016-12-07 164 views
0

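Counting the frequency of words in a text file in Python

I'm trying to figure out how to write a program that takes a file chosen by the user (by typing in the file name) and counts the frequency of each word the user enters.

I have most of it working, but when I enter more than one word to search for, only the first word shows the correct frequency; the rest show "0 occurrences".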

file_name = input("What file would you like to open? ") 
f = open(file_name, "r") 
the_full_text = f.read() 
words = the_full_text.split() 
search_word = input("What words do you want to find? ").split(",") 
len_list = len(search_word) 

word_number = 0 
print() 
print ('... analyzing ... hold on ...') 
print() 
print ('Frequency of word usage within', file_name+":") 
for i in range(len_list):
    frequency = 0
    for word in words:
        word = word.strip(",.")
        if search_word[word_number].lower() == word.lower():
            frequency += 1
    print (" ",format(search_word[word_number].strip(),'<20s'),"/", frequency, "occurrences")
    word_number = word_number + 1

An example of the output would be:

What file would you like to open? assignment_8.txt 
What words do you want to find? wey, rights, dem 

... analyzing ... hold on ... 

Frequency of word usage within assignment_8.txt: 
    wey    /96 occurrences 
    rights    /0 occurrences 
    dem    /0 occurrences 

What is wrong with my program? Please help :o

+2

If you are splitting on ',', shouldn't your input be 'wey,rights,dem', with no whitespace?

Answer

1

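You need to strip the whitespace from the search words. Splitting "wey, rights, dem" on "," leaves a leading space on every word after the first, so " rights" never compares equal to "rights".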
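
As a small illustration of what goes wrong, here is a sketch using the input from the question. The variable names `raw`, `parts` and `cleaned` are only for this example and do not appear in the original program.

```python
# The search input exactly as typed at the prompt in the question.
raw = "wey, rights, dem"

# Splitting on "," keeps the space that follows each comma.
parts = raw.split(",")
print(parts)      # ['wey', ' rights', ' dem']

# Stripping each piece removes the stray whitespace, so comparisons can match.
cleaned = [p.strip().lower() for p in parts]
print(cleaned)    # ['wey', 'rights', 'dem']
```

Only the first word has no leading space, which is why it was the only one counted correctly.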

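However, your current algorithm is very inefficient, because it has to rescan the entire text once for each search word. Here is a more efficient approach. First, we clean up the search words and put them in a list. Then we build a dictionary from that list to hold the count for each of those words as we find them in the text file, so the text only has to be scanned once.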

file_name = input("What file would you like to open? ")
# Read the whole file and split it into whitespace-separated words.
with open(file_name, "r") as f:
    words = f.read().split()

# Split the input on commas, then strip whitespace and lowercase each search word.
search_words = input("What words do you want to find? ").split(',')
search_words = [word.strip().lower() for word in search_words]
# Map each search word to a running count, starting at zero.
search_counts = dict.fromkeys(search_words, 0)

print('\n... analyzing ... hold on ...')
# Single pass over the text: only words present in the dictionary are counted.
for word in words:
    word = word.rstrip(",.").lower()
    if word in search_counts:
        search_counts[word] += 1

print('\nFrequency of word usage within', file_name + ":")
for word in search_words:
    print(" {:<20s}/{} occurrences".format(word, search_counts[word]))
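
If you want to try the counting logic without a file or keyboard input, the same approach can be wrapped in a function. This is only a sketch: the function name `count_words` and the sample sentence are made up for illustration and are not part of the original program or its data.

```python
def count_words(text, search_words):
    """Count occurrences of each search word in text, ignoring case."""
    # Clean the search words the same way as in the code above.
    cleaned = [w.strip().lower() for w in search_words]
    counts = dict.fromkeys(cleaned, 0)
    for word in text.split():
        # Drop trailing commas and full stops before comparing.
        word = word.rstrip(",.").lower()
        if word in counts:
            counts[word] += 1
    return counts


# Hypothetical sample text, standing in for the contents of the file.
sample = "Wey dem rights, wey dem get. Wey rights."
result = count_words(sample, "wey, rights, dem".split(","))
print(result)     # {'wey': 3, 'rights': 2, 'dem': 2}
```

Note that only "," and "." are removed, as in the code above, so a word followed by other punctuation such as "?" would not be matched.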