2014-01-10 65 views
2

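Python: incrementing a value

I am writing a program that takes text as input.

The program keeps a value called "tone", which starts at 0. When a word from the text is found in "posfeats", tone increases by 1. When a word from the text is found in "negfeats", tone decreases by 1.

However, no matter what text I enter, my code returns 0 for "tone". I suspect this is due to a mistake in my Python programming rather than in my algorithm.

Here is the code: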

import nltk.classify.util
from nltk.classify import NaiveBayesClassifier
from nltk.corpus import movie_reviews as reviews  # importing two corpora, movie_reviews and stopwords
from nltk.corpus import stopwords

def word_feats(words):
    stops = dict([(word, True) for word in stopwords.words('english')])  # English stopwords
    features = dict([(word, True) for word in words if word not in stops])  # features minus stopwords
    return features

def compare(words, negfeats, posfeats):
    sentiment = 0
    for word in words:
        if word in negfeats:
            sentiment -= 1
        if word in posfeats:
            sentiment += 1
    return sentiment


negReviews = reviews.fileids('neg')
posReviews = reviews.fileids('pos')

negfeats = [(word_feats(reviews.words(fileids=[f])), 'neg') for f in negReviews]
posfeats = [(word_feats(reviews.words(fileids=[f])), 'pos') for f in posReviews]

opinion = raw_input("Why don't you tell me about a movie you watched recently?\n\n")
tone = compare(opinion.split(), negfeats, posfeats)
print(str(tone))  # THIS KEEPS RETURNING 0
+0

Why do you set 'tone = 0' as the first line of compare()? I'm not sure how it affects the return value, but I don't think it belongs there; it should be 'sentiment = 0'! –

+1

If you want an unordered collection with no duplicate elements, it is clearer to use a 'set' than a 'dict' whose values you ignore. – user2357112
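A minimal sketch of that suggestion. It assumes a small hand-written stopword set in place of stopwords.words('english') so that it runs without the NLTK data; the names STOPS and word_set are illustrative and do not come from the original code.

```python
# Hypothetical stand-in for stopwords.words('english'),
# so the sketch runs without downloading NLTK data.
STOPS = {"the", "a", "was", "and"}

def word_set(words):
    """Return the distinct non-stopword words as a set."""
    return {word for word in words if word not in STOPS}

features = word_set("the plot was great and the cast was great".split())
print(sorted(features))  # ['cast', 'great', 'plot']
```

Membership tests such as `word in features` behave the same way for a set as for the keys of a dict, but the set makes it explicit that no values are attached.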

+0

@Kohler, you're right, I mistyped it completely when I copied it to stackoverflow. – Shuklaswag

Answers

1
negfeats = [(word_feats(reviews.words(fileids=[f])), 'neg') for f in negReviews] 
posfeats = [(word_feats(reviews.words(fileids=[f])), 'pos') for f in posReviews] 

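Did you mean to have the dict calls here? negfeats and posfeats are lists of (features, 'neg') and (features, 'pos') tuples, where features is the dict returned by word_feats. compare searches these lists for words and finds none, because the words are nested inside the tuples. Of course, a set would be a better choice for an unordered collection without duplicates.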
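The point above can be illustrated with a small sketch. It assumes tiny hand-written word lists in place of the movie_reviews corpus, and the names neg_reviews, pos_reviews, neg_words and pos_words are hypothetical. This is one possible fix, not necessarily the exact change the asker ended up making.

```python
# Hypothetical miniature "reviews"; the real code reads them from NLTK's movie_reviews corpus.
neg_reviews = [["dull", "boring", "plot"], ["awful", "acting"]]
pos_reviews = [["great", "cast"], ["wonderful", "moving", "story"]]

def word_feats(words):
    return {word: True for word in words}

# Shape used in the question: a list of (features_dict, label) tuples.
negfeats = [(word_feats(review), "neg") for review in neg_reviews]
print("boring" in negfeats)  # False: the list holds tuples, not words

# One possible fix: flatten all reviews of a label into a single set of words.
neg_words = {word for review in neg_reviews for word in review}
pos_words = {word for review in pos_reviews for word in review}

def compare(words, neg_words, pos_words):
    sentiment = 0
    for word in words:
        if word in neg_words:
            sentiment -= 1
        if word in pos_words:
            sentiment += 1
    return sentiment

print(compare("a great cast and a moving story".split(), neg_words, pos_words))  # 4
```

With a real corpus, a word that occurs in both positive and negative reviews is counted once in each direction and cancels out, so the two sets may need further filtering before the score is meaningful.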

+0

For some reason I mistyped it when copying it to stackoverflow. My code does now say sentiment = 0. But I still get a value of 0... – Shuklaswag

+0

@user3180238: Then don't retype it. Copy/paste is easier and more reliable. – user2357112

+0

@user3180238: Updated the answer to reflect the new version of your code. – user2357112