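SKlearn: loading training data by reading multiple files in a directory
I can feed in test data from a single file without any problem. However, whenever I try to feed in data from multiple files in a directory, I get the following error: AttributeError: 'NoneType' object has no attribute 'lower'. Please see the code below; I would appreciate any help. Thanks.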
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from nltk.corpus import stopwords
import numpy as np
import numpy.linalg as LA
import os
path = "C:\zircon"
def radfil():
    for file in os.listdir(path):
        current = os.path.join(path, file)
        if os.path.isfile(current):
            data = open(current, "rb").read()
            print data
train_set = [radfil()]
test_set = ["The sun in the sky is bright."]
stopWords = stopwords.words('english')
vectorizer = CountVectorizer(stop_words=stopWords, min_df=1)
#print vectorizer
transformer = TfidfTransformer()
#print transformer
trainVectorizerArray = vectorizer.fit_transform(train_set).toarray()
testVectorizerArray = vectorizer.transform(test_set).toarray()
print 'Fit Vectorizer to train set', trainVectorizerArray
print 'Transform Vectorizer to test set', testVectorizerArray
Could you paste the full stack trace? – Steve
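A likely cause, judging from the code as posted: radfil() prints each file but has no return statement, so it returns None and train_set becomes [None]. CountVectorizer lowercases every document by default, which would explain the 'NoneType' object has no attribute 'lower' error. Below is a minimal sketch of a reader that returns the file contents instead of printing them. The name read_documents and the UTF-8 encoding are assumptions for illustration, not part of the original code.

```python
import io
import os


def read_documents(path):
    """Return the text of every regular file in `path` as a list of strings."""
    documents = []
    # Sort the names so the document order is stable between runs.
    for name in sorted(os.listdir(path)):
        current = os.path.join(path, name)
        if os.path.isfile(current):  # skip subdirectories
            # Assumption: the files are UTF-8 text; undecodable bytes are replaced.
            with io.open(current, "r", encoding="utf-8", errors="replace") as handle:
                documents.append(handle.read())
    return documents
```

With this, the training set would be built as train_set = read_documents(path), without the extra list brackets, so that each file is one document passed to vectorizer.fit_transform(train_set).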