2016-04-06 65 views

HashingVectorizer and MultinomialNB do not work together

I am trying to write a Twitter sentiment analysis program with scikit-learn in Python 2.7. The OS is Linux Ubuntu 14.04.

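In the vectorization step, I want to use HashingVectorizer(). When I test classification accuracy, it works fine with the LinearSVC, NuSVC, GaussianNB, BernoulliNB, and LogisticRegression classifiers, but with MultinomialNB it returns this error: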

Traceback (most recent call last): 
    File "/media/test.py", line 310, in <module> 
    classifier_rbf.fit(train_vectors, y_trainTweets) 
    File "/home/.local/lib/python2.7/site-packages/sklearn/naive_bayes.py", line 552, in fit 
    self._count(X, Y) 
    File "/home/.local/lib/python2.7/site-packages/sklearn/naive_bayes.py", line 655, in _count 
    raise ValueError("Input X must be non-negative") 
ValueError: Input X must be non-negative 
[Finished in 16.4s with exit code 1] 

Below is the code related to this error:

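from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.naive_bayes import MultinomialNB

# x_trainTweets / x_testTweets hold the tweet texts, y_trainTweets the labels
vectorizer = HashingVectorizer()
train_vectors = vectorizer.fit_transform(x_trainTweets)
test_vectors = vectorizer.transform(x_testTweets)

classifier_rbf = MultinomialNB()
classifier_rbf.fit(train_vectors, y_trainTweets)  # raises the ValueError above
prediction_rbf = classifier_rbf.predict(test_vectors)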

Why is this happening, and how can I fix it?

Answers


You need to set the non_negative parameter to True when initializing your vectorizer in sklearn:

vectorizer = HashingVectorizer(non_negative=True) 
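As for why this happens: by default HashingVectorizer applies a sign derived from the hash to each feature, so some values in the output come out negative, while MultinomialNB only accepts non-negative input. The toy sketch below illustrates the idea only. It is not scikit-learn's implementation; the helper name `toy_signed_hash_counts`, the use of md5, and the tiny `n_features` are all assumptions made for illustration.

```python
import hashlib


def toy_signed_hash_counts(tokens, n_features=8, alternate_sign=True):
    """Toy hashing-trick vectorizer (illustration only, not sklearn's code)."""
    vec = [0] * n_features
    for tok in tokens:
        # Hash the token to a large integer (md5 is an arbitrary choice here).
        h = int(hashlib.md5(tok.encode("utf-8")).hexdigest(), 16)
        index = h % n_features
        # One bit of the hash decides the sign when alternate_sign is on,
        # which is how negative feature values can appear.
        sign = -1 if (alternate_sign and (h >> 64) & 1) else 1
        vec[index] += sign
    return vec


tokens = "i love this phone but i hate this weather".split()
signed = toy_signed_hash_counts(tokens, alternate_sign=True)
unsigned = toy_signed_hash_counts(tokens, alternate_sign=False)
print(signed)    # may contain negative entries
print(unsigned)  # plain counts, never negative
```

With the sign switched off, every entry is a plain count, which is the kind of input a multinomial model expects.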

In 0.19+ it should be HashingVectorizer(alternate_sign=False).
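For reference, here is a minimal end-to-end sketch of the fix on scikit-learn 0.19 or later. The tweets and labels are made-up placeholders standing in for x_trainTweets, y_trainTweets, and x_testTweets from the question, so treat this as an illustration under those assumptions and not as the asker's actual pipeline.

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data; replace with the real tweets and labels.
train_tweets = [
    "I love this phone",
    "what a great day",
    "I hate this weather",
    "this is awful",
]
train_labels = ["pos", "pos", "neg", "neg"]
test_tweets = ["I love this day"]

# alternate_sign=False turns off the hash-derived sign, so every
# feature value is non-negative (requires scikit-learn 0.19+).
vectorizer = HashingVectorizer(alternate_sign=False)

# HashingVectorizer is stateless, so transform() is enough for both sets.
train_vectors = vectorizer.transform(train_tweets)
test_vectors = vectorizer.transform(test_tweets)

classifier = MultinomialNB()
classifier.fit(train_vectors, train_labels)  # no ValueError now
prediction = classifier.predict(test_vectors)
print(prediction)
```

On older releases that still have the parameter, HashingVectorizer(non_negative=True) from the answer above plays the same role.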