2017-02-26 21 views

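Word2Vec similarity function not working

I am using PyCharm and loading a model that was trained on words with Word2Vec. I am trying to check the similarity between two words, but I get the error shown below.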

    from gensim.models import word2vec

    # Loading model trained on words
    model = word2vec.Word2Vec.load('models/text8.model')

    # Loading model enhanced with phrases (2-grams)
    model_phrase = word2vec.Word2Vec.load('models/text8.phrase.model')

    # Words that are similar are close in the sense of the cosine similarity.
    sim = model.similarity('woman', 'man')
    print('Printing word similarity between "woman" and "man" : {0}'.format(sim))

Traceback (most recent call last):
  File "C:\Program Files (x86)\JetBrains\PyCharm 2016.3.2\helpers\pydev\pydevd.py", line 1596, in <module>
    globals = debugger.run(setup['file'], None, None, is_module)
  File "C:\Program Files (x86)\JetBrains\PyCharm 2016.3.2\helpers\pydev\pydevd.py", line 974, in run
    pydev_imports.execfile(file, globals, locals) # execute the script
  File "C:/Users/XXX/Desktop/code/word2vec/embedding_word2vec_students.py", line 144, in <module>
    sim = model.similarity('woman', 'man')
  File "C:\Users\XXX\Anaconda3\lib\site-packages\gensim\models\word2vec.py", line 1194, in similarity
    return self.wv.similarity(w1, w2)
  File "C:\Users\XXX\Anaconda3\lib\site-packages\gensim\models\keyedvectors.py", line 587, in similarity
    return dot(matutils.unitvec(self[w1]), matutils.unitvec(self[w2]))
  File "C:\Users\XXX\Anaconda3\lib\site-packages\gensim\models\keyedvectors.py", line 567, in __getitem__
    return self.word_vec(words)
  File "C:\Users\XXX\Anaconda3\lib\site-packages\gensim\models\keyedvectors.py", line 271, in word_vec
    return self.syn0[self.vocab[word].index]
IndexError: list index out of range

When I debug, the problem seems to come from this line in the function word_vec:

return self.syn0[self.vocab[word].index] 

However, I don't know why this happens. If you can help, many thanks in advance.


What does each of the following return: (1) `model.wv['man']`; (2) `model.wv['woman']`; (3) `len(model.wv.syn0)`; (4) `model.wv.vocab['man'].index`; (5) `model.wv.vocab['woman'].index`? – gojomo
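
The checks in that comment can be bundled into one small helper that compares each word's stored index with the number of vectors. This is only a minimal sketch, not gensim code: `diagnose`, `vocab_index`, and `n_vectors` are hypothetical names, and the toy data stands in for the real model.

```python
def diagnose(vocab_index, n_vectors, words):
    """Report, per word, whether it is in the vocabulary and whether its
    stored index is a valid row of the vector array.

    vocab_index: mapping word -> integer row index (hypothetical input)
    n_vectors:   number of rows in the vector array
    """
    report = {}
    for word in words:
        if word not in vocab_index:
            report[word] = "not in vocabulary"
        elif not 0 <= vocab_index[word] < n_vectors:
            report[word] = "index {} out of range for {} vectors".format(
                vocab_index[word], n_vectors)
        else:
            report[word] = "ok"
    return report


# Toy data standing in for a real model: "woman" points past the end
# of a 3-row vector array, and "queen" is not in the vocabulary at all.
toy_vocab = {"man": 0, "woman": 5}
result = diagnose(toy_vocab, 3, ["man", "woman", "queen"])
print(result)
```

With the model from the question, the inputs could presumably be built as `{w: v.index for w, v in model.wv.vocab.items()}` and `len(model.wv.syn0)`, assuming the installed gensim version exposes the attributes named in the comment.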

Answer


It sounds like 'woman' or 'man' might not be part of your vocabulary. The first thing I would check is whether they appear in the model you loaded.
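
One way to run that check before calling `similarity` is a membership test against the vocabulary. The sketch below is an illustration only: `missing_words` is a hypothetical helper, and the toy set stands in for the loaded model's vocabulary.

```python
def missing_words(vocab, words):
    """Return the words that are absent from vocab.

    vocab can be any container that supports the `in` operator.
    """
    return [w for w in words if w not in vocab]


# Toy vocabulary standing in for the loaded model's vocabulary.
toy_vocab = {"man", "king", "queen"}
absent = missing_words(toy_vocab, ["woman", "man"])
if absent:
    print("Not in vocabulary: {}".format(", ".join(absent)))
else:
    print("Both words are present; similarity() can be called.")
```

For the model in the question, `model.wv.vocab` (the mapping the traceback indexes as `self.vocab[word]`) could be passed as `vocab`, assuming the gensim version shown in the traceback.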