2017-06-19

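Program for trigrams

I want to write a Python program that outputs the frequency of those trigrams that occur three or more times in examens.txt. Upper and lower case and special characters should be ignored, and the output should be sorted by frequency.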

My teacher told me that I may only change two lines! But I just cannot spot the problem. To me the code looks correct, but it does not work.

with open("examen.txt") as f: 
    data = f.read() 
    text = data.replace("\xad", "") 

words = [] 
for word in data.lower().split(): 
    word = word.strip("‚‘!,.:«»-()'_#-–„「*?") 
    if word != "":
        if not word[-1].isalnum():
            print(repr(word))
        words.append(word)

trigrams = {} 
for i in range(len(words)): 
    word = words[i] 
    nextword = words[i + 1] 
    nextnextword = words[i + 2] 
    key = (word, nextword, nextnextword) 
    trigrams[key] = trigrams.get(key, 0) + 1 

l = list(trigrams.items()) 
l.sort(key=lambda x: (x[1], x[0])) 
l.reverse() 
for key, count in trigrams: 
    if count < 3:
        break
    word = key[0] 
    nextword = key[1] 
    nextnextword = key[2] 
    print(word, nextword, nextnextword, count) 

Have you tried running it? Does it show any errors?

What does the program output? How does that differ from what you expect? – Cristina

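An example of your 'examens.txt' would also help, as would a more detailed explanation of *what does not work*. Does it crash? Does it produce wrong results?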

Answer

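You iterate too far into words when you build the trigrams, and the final loop does not go over the right data structure: it iterates over the dict trigrams instead of the sorted list l.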
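To make the two problems concrete, here is a minimal sketch on a made-up word list. The sample words and the helper `count_trigrams` are assumptions for illustration only and are not taken from the asker's file or code.

```python
# Hypothetical sample data, not from the asker's examen.txt.
words = ["a", "b", "c", "a", "b", "c"]


def count_trigrams(words, stop):
    """Count consecutive word triples for start indices 0 .. stop - 1."""
    trigrams = {}
    for i in range(stop):
        key = (words[i], words[i + 1], words[i + 2])
        trigrams[key] = trigrams.get(key, 0) + 1
    return trigrams


# Problem 1: with range(len(words)) the last start indices have no
# words[i + 2], so the lookup runs past the end of the list.
try:
    count_trigrams(words, len(words))
    overrun_error = None
except IndexError as exc:
    overrun_error = exc

# Stopping two positions early visits every complete trigram exactly once.
trigrams = count_trigrams(words, len(words) - 2)

# Problem 2: iterating over a dict yields only its keys. Each key here is a
# 3-tuple of words, so unpacking it into (key, count) fails.
try:
    for key, count in trigrams:
        pass
    unpack_error = None
except ValueError as exc:
    unpack_error = exc

print(type(overrun_error).__name__)
print(type(unpack_error).__name__)
print(trigrams)
```

Iterating over `trigrams.items()`, or over the sorted list built from it, yields the `(key, count)` pairs that the final loop expects.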

Changing just two lines, I would write:

with open("examen.txt") as f: 
    data = f.read() 
    text = data.replace("\xad", "") 

words = [] 
for word in data.lower().split(): 
    word = word.strip("‚‘!,.:«»-()'_#-–„「*?") 
    if word != "":
        if not word[-1].isalnum():
            print(repr(word))
        words.append(word)

trigrams = {} 
# Changed line 1: stop two positions early so that words[i + 2] always exists.
for i in range(len(words) - 2):
    word = words[i] 
    nextword = words[i + 1] 
    nextnextword = words[i + 2] 
    key = (word, nextword, nextnextword) 
    trigrams[key] = trigrams.get(key, 0) + 1 

l = list(trigrams.items()) 
l.sort(key=lambda x: (x[1], x[0])) 
l.reverse() 
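# Changed line 2: iterate over the sorted list l, not over the dict trigrams.
for key, count in l:
    if count < 3:
        # l is sorted by descending count, so no later entry can qualify.
        break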
    word = key[0] 
    nextword = key[1] 
    nextnextword = key[2] 
    print(word, nextword, nextnextword, count)
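
If the two-line restriction did not apply, the same idea could be sketched more compactly with `collections.Counter` and `zip`. This is only an illustrative alternative, not the teacher's intended solution: the function name `frequent_trigrams`, the `min_count` parameter, and the sample sentence are hypothetical. It also assumes the soft-hyphen removal was meant to apply to the text that gets split, whereas the original code computes `text` but then splits `data`.

```python
from collections import Counter

# Same characters the original code strips from both ends of each word.
STRIP_CHARS = "‚‘!,.:«»-()'_#-–„「*?"


def frequent_trigrams(text, min_count=3):
    """Return [(trigram, count), ...] for trigrams seen at least min_count times."""
    # Assumption: soft hyphens should be dropped before splitting into words.
    text = text.replace("\xad", "")
    words = [w.strip(STRIP_CHARS) for w in text.lower().split()]
    words = [w for w in words if w]  # drop tokens that were only punctuation
    # zip stops at the shortest input, so only complete trigrams are counted.
    counts = Counter(zip(words, words[1:], words[2:]))
    # Same ordering as the original: descending by count, then by trigram.
    ranked = sorted(counts.items(), key=lambda item: (item[1], item[0]), reverse=True)
    return [(trigram, n) for trigram, n in ranked if n >= min_count]


# Made-up sample text for illustration.
sample = "The cat sat. The cat sat! the cat sat, and the cat ran."
for trigram, count in frequent_trigrams(sample):
    print(*trigram, count)
```

For the real input, the text would be read from the file first, for example with `open("examen.txt", encoding="utf-8")`. The encoding is an assumption, since the question does not state it.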