2016-09-07

Why is there no output when using Word2Vec?

I have a DataFrame DF that looks like this:

index posts 
0  <div class="content">A number of <br/><br/>three ... </div> 
1  <div class="content">Stack ... <br/><br/>overflow ... </div> 
... 

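I then try to tokenize each post with:

sentences = []
for post in DF["posts"]:
    sentences += utility.tosentences(post, tokenizer)

I then run Word2Vec using the following:

import logging
from gensim.models import word2vec

logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s',
                    level=logging.INFO)

num_features = 100    # word vector dimensionality
min_word_count = 7    # ignore words that occur fewer times than this
num_workers = 2       # number of worker threads
context = 5           # context window size
downsampling = 1e-5   # downsampling threshold for frequent words

print("Training model...")
model = word2vec.Word2Vec(sentences, workers=num_workers,
                          size=num_features, min_count=min_word_count,
                          window=context, sample=downsampling)

# Keep only the normalized vectors to save memory (no further training afterwards)
model.init_sims(replace=True)

# Save the model; it can be loaded later with Word2Vec.load(model_name)
model_name = "what"
model.save(model_name)
print("finished")

I then test the model as follows: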

model.doesnt_match("travel no Warning health".split()) 

However, it produces no output at all.

I don't understand the meaning of the large amount of output I got above. Why doesn't this work?

Answer

The function model.doesnt_match() does not print anything; it returns a value. Print the returned value to see the output.

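If you copied and pasted this from this word2vec tutorial: it shows the output you would see if you ran these commands in an interactive console. (It also assumes you understand what you are doing.)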
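
The difference between a script and the interactive console can be sketched without gensim at all. In the minimal example below, odd_one_out is a hypothetical stand-in for model.doesnt_match() (it just returns the longest word; it is not how gensim picks the odd word out), and run is a hypothetical helper that compiles one line either the way a script file is executed ("exec") or the way the console executes it ("single"), capturing whatever is printed.

```python
import contextlib
import io


def odd_one_out(words):
    """Hypothetical stand-in for model.doesnt_match(): returns a value, prints nothing."""
    return max(words, key=len)


def run(source, mode):
    """Execute one line of source and return everything written to stdout."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(compile(source, "<demo>", mode), {"odd_one_out": odd_one_out})
    return buf.getvalue()


call = 'odd_one_out("travel no Warning health".split())'

# Script mode: the return value of a bare expression is silently discarded.
script_output = run(call, "exec")

# Interactive-console mode: the console echoes the repr of the value.
console_output = run(call, "single")

# Script mode with an explicit print: the value is shown.
printed_output = run("print(" + call + ")", "exec")
```

Under these assumptions, script_output is empty while console_output and printed_output both contain the returned word. For the script in the question, the corresponding change would be to wrap the call: print(model.doesnt_match("travel no Warning health".split())).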