NLTK: classifying with a trained classifier

I have this small chunk of code, which I found here:

import nltk.classify.util
from nltk.classify import NaiveBayesClassifier
from nltk.corpus import movie_reviews
from nltk.corpus import stopwords

def word_feats(words):
    # Map a list of words to a feature dict of the form {word: True}
    return dict([(word, True) for word in words])

negids = movie_reviews.fileids('neg')
posids = movie_reviews.fileids('pos')

negfeats = [(word_feats(movie_reviews.words(fileids=[f])), 'neg') for f in negids]
posfeats = [(word_feats(movie_reviews.words(fileids=[f])), 'pos') for f in posids]

# Train on the first 3/4 of each class; integer division keeps the cutoffs valid as slice indices
negcutoff = len(negfeats) * 3 // 4
poscutoff = len(posfeats) * 3 // 4

trainfeats = negfeats[:negcutoff] + posfeats[:poscutoff]
testfeats = negfeats[negcutoff:] + posfeats[poscutoff:]
print('train on %d instances, test on %d instances' % (len(trainfeats), len(testfeats)))

classifier = NaiveBayesClassifier.train(trainfeats)
print('accuracy:', nltk.classify.util.accuracy(classifier, testfeats))
classifier.show_most_informative_features()

But how can I classify a random word that might be in the corpus?

classifier.classify('magnificent')

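doesn't work. Does it need some kind of object?

Thank you very much.

Edit: Thanks to @unutbu's feedback, some digging here, and reading the comments on the original post, the following code yields 'pos' or 'neg' (this one is a 'pos'):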

print(classifier.classify(word_feats(['magnificent'])))

And this yields the word's score for 'pos' or 'neg' (here, the probability of 'neg'):

print(classifier.prob_classify(word_feats(['magnificent'])).prob('neg'))
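To see both labels side by side, the distribution returned by prob_classify can be inspected directly. The following is only a minimal sketch: it assumes NLTK is installed, and it trains on a tiny hand-made data set (the toy_train list and the toy_classifier name are made up for illustration) instead of the movie_reviews corpus, so the numbers it prints say nothing about the real classifier.

```python
from nltk.classify import NaiveBayesClassifier

def word_feats(words):
    # Same feature extractor as in the question: {word: True}
    return {word: True for word in words}

# Hypothetical toy training data, standing in for movie_reviews
toy_train = [
    (word_feats(['magnificent', 'great', 'film']), 'pos'),
    (word_feats(['wonderful', 'great', 'acting']), 'pos'),
    (word_feats(['boring', 'awful', 'film']), 'neg'),
    (word_feats(['terrible', 'awful', 'acting']), 'neg'),
]
toy_classifier = NaiveBayesClassifier.train(toy_train)

# prob_classify returns a probability distribution over the labels
dist = toy_classifier.prob_classify(word_feats(['magnificent']))
for label in sorted(dist.samples()):
    print(label, dist.prob(label))

# The most likely label is what classify() returns
print(dist.max())
```

With the classifier from the question, the same calls (samples, prob, max) should apply to classifier.prob_classify(word_feats(['magnificent'])).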


2013-02-05 storedope

print(classifier.classify(word_feats(['magnificent'])))

yields

pos

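The classifier.classify method does not operate on individual words per se; it classifies based on a dict of features. In this example, word_feats maps a sentence (a list of words) to a dict of features.

Here is another example (from the NLTK book) which uses the NaiveBayesClassifier. By comparing the similarities and differences between that example and the one you posted, you can get a better understanding of how it is used.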
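As a small illustration of what that dict of features looks like, here is a sketch in plain Python (no NLTK needed). The variable names list_feats and string_feats are made up for this example. It also shows a pitfall worth keeping in mind: handing word_feats a bare string instead of a list makes it iterate over characters.

```python
def word_feats(words):
    # Same feature extractor as in the question: {word: True}
    return {word: True for word in words}

# A one-word "sentence" becomes a one-entry feature dict
list_feats = word_feats(['magnificent'])
print(list_feats)

# A bare string is iterated character by character,
# so the features are single letters, not the word
string_feats = word_feats('magnificent')
print(sorted(string_feats))
```

The first dict is the kind of object classifier.classify expects, which is why wrapping the word in a list before calling word_feats matters.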


2013-02-05 20:58:54 unutbu
