2013-07-24 53 views

NLTK perplexity error

I am getting this error message:

Traceback (most recent call last): 
     File "C:/Users/shengrong/Desktop/bigram", line 55, in <module> 
     bg = bigram(file) 
     File "C:/Users/shengrong/Desktop/bigram", line 43, in bigram 
     return tt1.perplexity(my_bigrams) 
     File "C:\Python27\lib\site-packages\nltk\model\ngram.py", line 217, in perplexity 
     return pow(2.0, self.entropy(text)) 
     File "C:\Python27\lib\site-packages\nltk\model\ngram.py", line 205, in entropy 
     e += self.logprob(token, context) 
     File "C:\Python27\lib\site-packages\nltk\model\ngram.py", line 151, in logprob 
     return -log(self.prob(word, context), 2) 
    ValueError: math domain error 




import os,csv,nltk 
from nltk.model.ngram import NgramModel 
from nltk.probability import LidstoneProbDist 

fout = open("/Users/shengrong/Documents/personal/WN1.data.csv", "w") 

outfilehandle = csv.writer(fout, 
          delimiter=",", 
          quotechar='"', 
          quoting=csv.QUOTE_NONNUMERIC) 


localrow = [] 
localrow.append("File name") 
localrow.append("Perplexity for unigram") 
localrow.append("Perplexity for bigram") 
localrow.append("Perplexity for trigram") 
outfilehandle.writerow(localrow) 

def bigram(file): 
    file_object = open(file) 
    ln=file_object.read() 

    words = nltk.word_tokenize(ln) 
    my_bigrams = nltk.bigrams(words) 
    my_trigrams = nltk.trigrams(words) 



    tt1=NgramModel(2, my_bigrams, estimator = None) 


    return tt1.perplexity(my_bigrams)  




#set the path of the folder 
os.chdir("/Users/shengrong/Documents/A") 
s = os.getcwd() 
#search files in the folder 
files = os.listdir(s) 

for file in files: 
    bg = bigram(file) 
    localrow= [] 
    localrow.append(file) 
    localrow.append(bg) 

    outfilehandle.writerow(localrow) 

fout.close() 

How can I solve this problem? Without the loop that reads the folder, my code runs fine.

Thank you all.

Answer


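The error you are getting is a math domain error, which `math.log` raises when it is given zero or a negative number. Since the code works without the loop, one or more of the files probably contain data that produces such a value.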
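
The failing frame in the traceback is `-log(self.prob(word, context), 2)`. Below is a minimal sketch of how that expression can fail, assuming the model returned a probability of 0 for some word/context pair. The helper name `safe_logprob` is hypothetical and is not part of NLTK; it only illustrates one way to guard the call.

```python
from math import log


def safe_logprob(p):
    """Return -log2(p), or infinity when p is not positive.

    Hypothetical helper for illustration only; not an NLTK function.
    """
    if p <= 0.0:
        return float("inf")
    return -log(p, 2)


# math.log rejects 0, which is what surfaces as the ValueError in the traceback.
try:
    log(0.0, 2)
    raised = False
except ValueError:
    raised = True
```

An infinite log-probability makes the perplexity infinite as well, so this guard only avoids the crash; whether that is acceptable depends on how the results are used.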

Please check that every file in the folder contains data in the correct format and with the expected values.
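
One way to find the offending files is to catch the error per file instead of letting the first failure stop the loop. This is a rough sketch, not a definitive fix: `check_files` is a hypothetical helper, and `compute` stands in for the question's `bigram()` function.

```python
import os


def check_files(folder, compute):
    """Run compute(path) on every regular file in folder.

    Returns (results, failures): results maps file name to the computed
    value, failures maps file name to the error message.
    `compute` is a stand-in for the question's bigram() function.
    """
    results, failures = {}, {}
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue  # os.listdir also returns sub-directories; skip them
        try:
            results[name] = compute(path)
        except ValueError as err:
            # Record the file that triggered the error and keep going
            failures[name] = str(err)
    return results, failures
```

Calling `check_files("/Users/shengrong/Documents/A", bigram)` would then return the perplexities that could be computed, plus the names of the files that raised `ValueError`, which can be inspected by hand.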


They are all txt files. For some files it runs fine; for others, however, the math domain error occurs. – user2612912