2016-12-18 22 views

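Why is the following TF-IDF vectorization failing?

Hello, I would like to run the following experiment. First I created a vectorizer called tfidf_vectorizer: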

from sklearn.feature_extraction.text import TfidfVectorizer
tfidf_vectorizer = TfidfVectorizer(min_df=10, ngram_range=(1, 3), analyzer='word', max_features=500)

Then I vectorized the following list:

tfidf = tfidf_vectorizer.fit_transform(listComments) 

My list of comments looks like this:

listComments = ["hello this is a test","the car is red",...] 

I tried to save the model as follows:

#Saving tfidf 
with open('vectorizerTFIDF.pickle','wb') as idxf: 
    pickle.dump(tfidf, idxf, pickle.HIGHEST_PROTOCOL) 

I would like to apply the same fitted TF-IDF vectorizer to the following list:

lastComment = ["this is a car"] 

Loading the model:

with open('vectorizerTFIDF.pickle', 'rb') as infile: 
    tdf = pickle.load(infile) 

vector = tdf.transform(lastComment) 

However, I get:

Traceback (most recent call last): 
    File "C:/Users/LDA_test/ldaTest.py", line 141, in <module> 
    vector = tdf.transform(lastComment) 
    File "C:\Program Files\Anaconda3\lib\site-packages\scipy\sparse\base.py", line 559, in __getattr__ 
    raise AttributeError(attr + " not found") 
AttributeError: transform not found 

I hope someone can help me with this. Thanks in advance.


You are pickling the vectorized array, not the transformer. You need `pickle.dump(tfidf_vectorizer, idxf, pickle.HIGHEST_PROTOCOL)` – maxymoo


@maxymoo, thank you very much for the help. Could you post this as a full answer? I will accept it if it solves my problem. – neo33

Answers


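You are pickling the vectorized array, not the transformer. `tfidf` is the sparse matrix returned by `fit_transform`, and a SciPy sparse matrix has no `transform` method, which is why the traceback ends in scipy\sparse\base.py with "transform not found". Pickle the fitted vectorizer instead: `pickle.dump(tfidf_vectorizer, idxf, pickle.HIGHEST_PROTOCOL)`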
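As a rough end-to-end sketch of that fix (not the asker's exact script): the three-sentence corpus is made up for illustration, and `min_df=1` replaces the question's `min_df=10` only because a corpus this small would otherwise have every term pruned. The round trip goes through memory with `pickle.dumps`/`pickle.loads`; writing to a file with `pickle.dump` as in the question should behave the same way.

```python
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus standing in for listComments.
list_comments = ["hello this is a test", "the car is red", "this is a red car"]

# Assumption: min_df=1 because the toy corpus is tiny (the question uses min_df=10).
tfidf_vectorizer = TfidfVectorizer(
    min_df=1, ngram_range=(1, 3), analyzer="word", max_features=500
)

# fit_transform returns a SciPy sparse matrix, which is data, not a transformer.
tfidf = tfidf_vectorizer.fit_transform(list_comments)

# Pickle the fitted vectorizer itself, not the matrix it produced.
payload = pickle.dumps(tfidf_vectorizer, pickle.HIGHEST_PROTOCOL)

# Later: restore the vectorizer and reuse its learned vocabulary and IDF weights.
loaded_vectorizer = pickle.loads(payload)
vector = loaded_vectorizer.transform(["this is a car"])

print(type(tfidf).__name__, hasattr(tfidf, "transform"))
print(vector.shape)
```

The new comment is mapped into the same feature space as the training matrix, so `vector` has one row and as many columns as `tfidf`.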


Thanks for the help, I will try again. – neo33


Thanks for the help, I really appreciate it. It works well now. – neo33