I am completely new to Python and gensim. I am trying to use word2vec from gensim with Python 3.4 on Windows 7 (64-bit), and I get an error when running Word2Vec:
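import csv
from gensim.models import Word2Vec

# Read every row into a list of token lists while the file is still open;
# Word2Vec iterates over the corpus more than once.
with open('Data.csv', 'r') as csvfile:
    Word2VecTextTrain = list(csv.reader(csvfile, delimiter=' '))

model = Word2Vec(Word2VecTextTrain, size=100, window=3, min_count=5, workers=4)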
"Data.csv" contains 30k lines of text. Each line is a complete or incomplete sentence of at most 20 words. Some lines may contain "/" or numbers.
I get this error:
Traceback (most recent call last):
  File "C:/Users/Home/PycharmProjects/Word2Vec Project/Word2Vec_2016_03_23", line 26, in <module>
    model = Word2Vec(Word2VecTextTrain, size=100, window=5, min_count=5, workers=4)
  File "C:\Users\Home\Miniconda3\lib\site-packages\gensim\models\word2vec.py", line 431, in __init__
    self.build_vocab(sentences, trim_rule=trim_rule)
  File "C:\Users\Home\Miniconda3\lib\site-packages\gensim\models\word2vec.py", line 497, in build_vocab
    self.finalize_vocab() # build tables & arrays
  File "C:\Users\Home\Miniconda3\lib\site-packages\gensim\models\word2vec.py", line 625, in finalize_vocab
    self.reset_weights()
  File "C:\Users\Home\Miniconda3\lib\site-packages\gensim\models\word2vec.py", line 932, in reset_weights
    self.syn0[i] = self.seeded_vector(self.index2word[i] + str(self.seed))
  File "C:\Users\Home\Miniconda3\lib\site-packages\gensim\models\word2vec.py", line 946, in seeded_vector
    once = random.RandomState(uint32(self.hashfxn(seed_string)))
OverflowError: Python int too large to convert to C long
Process finished with exit code 1
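I don't know what causes this error. Any help is truly appreciated.
Thanks for your suggestion. Unfortunately, it gives me the same error. – user3439050
Can you share the input file? – kampta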
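The last frame of the traceback passes the result of self.hashfxn(seed_string) to uint32. One possible explanation, which nothing in this thread confirms, is that the built-in hash on 64-bit Python 3 returns values outside the range NumPy can convert on Windows, where a C long is 32 bits wide. The sketch below shows a helper that folds the hash into the unsigned 32-bit range. The name hash32 is hypothetical and introduced only for illustration.

```python
def hash32(value):
    """Hypothetical helper: fold Python's built-in hash into the unsigned 32-bit range."""
    # Masking with 0xFFFFFFFF keeps the low 32 bits and makes the result non-negative.
    return hash(value) & 0xFFFFFFFF


# The seed string mirrors the traceback: a vocabulary word followed by str(seed).
seed_string = "example" + str(1)
seed = hash32(seed_string)

# The masked value always fits in an unsigned 32-bit integer.
print(0 <= seed < 2**32)

# With gensim installed, the helper could be tried as the hash function
# (assumption: this is a workaround to test, not a confirmed fix):
# model = Word2Vec(Word2VecTextTrain, size=100, window=3, min_count=5,
#                  workers=4, hashfxn=hash32)
```

hashfxn is an existing Word2Vec constructor parameter that defaults to the built-in hash, so passing hash32 only changes how the initial vectors are seeded. Because Python 3 randomizes string hashes per process, the seeds still differ between runs unless PYTHONHASHSEED is set.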