2013-10-11 27 views

Using the Google word2vec .bin file in gensim (Python)

I'm trying to load a pretrained .bin file from the word2vec site (freebase-vectors-skipgram1000.bin.gz) into the gensim implementation of word2vec. The model loads fine

using:

model = word2vec.Word2Vec.load_word2vec_format('...../free....-en.bin', binary= True) 

and creates an object:

>>> print model 
<gensim.models.word2vec.Word2Vec object at 0x105d87f50> 

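But when I run the most_similar function, it can't find the words in the vocabulary. The error output is below.

Any ideas where I'm going wrong?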

>>> model.most_similar(['girl', 'father'], ['boy'], topn=3) 
2013-10-11 10:22:00,562 : WARNING : word 'girl' not in vocabulary; ignoring it
2013-10-11 10:22:00,562 : WARNING : word 'father' not in vocabulary; ignoring it
2013-10-11 10:22:00,563 : WARNING : word 'boy' not in vocabulary; ignoring it
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/....../anaconda/python.app/Contents/lib/python2.7/site-packages/gensim-0.8.7/py2.7.egg/gensim/models/word2vec.py", line 312, in most_similar
    raise ValueError("cannot compute similarity with no input")
ValueError: cannot compute similarity with no input

Answer


In '...../free....-en.bin' the words have the form

EN/boardwalk_chapel EN/mutsu_munemitsu EN/goffstown EN/yaw_axis EN/john_e_fogarty_international_center EN/francielle_manoel_alberto EN/shinji_harada

So when you pass 'girl', it isn't there, because the bare word is not a key in that vocabulary.
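
If you want to check this yourself, here is a minimal sketch of how you might search the vocabulary for the key that corresponds to a plain word. The helper `find_keys` is a hypothetical name, not part of gensim, and the sample keys are just the ones quoted above; the exact prefix in your file may differ, so inspect the real keys first.

```python
def find_keys(vocab, word):
    """Return the vocabulary keys whose last '/'-separated part equals `word`.

    `vocab` is any iterable of key strings. This is an illustrative helper,
    not a gensim function.
    """
    word = word.lower()
    return [key for key in vocab if key.lower().rsplit('/', 1)[-1] == word]


# Sample keys copied from the answer above; a real model has far more.
vocab = [
    'EN/boardwalk_chapel',
    'EN/mutsu_munemitsu',
    'EN/goffstown',
    'EN/yaw_axis',
]

# A bare word is not a key, which is why most_similar ignores it.
print('goffstown' in vocab)

# Searching on the last path component recovers the full key.
print(find_keys(vocab, 'goffstown'))

# No entry for the plain word 'girl' among these sample keys.
print(find_keys(vocab, 'girl'))
```

With a loaded model you would pass its vocabulary instead of the sample list, assuming it is exposed as the dict `model.vocab` in this gensim version (newer releases use `key_to_index` on the keyed vectors instead). Whatever full keys come back are what you would pass to `most_similar`, provided they actually exist in the file.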