Why is there no output from Word2Vec?
I have a DataFrame DF that looks like this:
index posts
0 <div class="content">A number of <br/><br/>three ... </div>
1 <div class="content">Stack ... <br/><br/>overflow ... </div>
...
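Because each post is an HTML fragment rather than plain text, it has to be stripped of markup and split into lists of words before Word2Vec can train on it. Below is a minimal sketch of such a step for Python 3, using only the standard library. The name `html_to_sentences` and its splitting rules are assumptions made for illustration; it is not the helper used in this question.

```python
import re
from html import unescape


def html_to_sentences(post_html):
    """Hypothetical helper: turn one HTML post into a list of word lists."""
    # Replace every tag (e.g. <div>, <br/>) with a space so words do not fuse.
    text = re.sub(r"<[^>]+>", " ", post_html)
    # Decode entities such as &amp; into plain characters.
    text = unescape(text)
    sentences = []
    # Naive sentence split on runs of ., ! or ?
    for raw in re.split(r"[.!?]+", text):
        words = re.findall(r"[a-z0-9']+", raw.lower())
        if words:  # skip empty fragments
            sentences.append(words)
    return sentences


posts = ['<div class="content">A number of <br/><br/>three. It works!</div>']
sentences = []
for post in posts:
    sentences += html_to_sentences(post)
print(sentences)
```

Word2Vec expects exactly this shape: an iterable of sentences, each of which is a list of string tokens.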
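I then try to tokenize each post with:
sentences = []
for post in DF["posts"]:
    sentences += utility.tosentences(post, tokenizer)
I then run Word2Vec with the following:
import logging
from gensim.models import word2vec

logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s',
                    level=logging.INFO)
num_features = 100    # word vector dimensionality
min_word_count = 7    # ignore words that occur fewer than 7 times
num_workers = 2       # number of training threads
context = 5           # context window size
downsampling = 1e-5   # downsampling threshold for frequent words
print("Training model...")
model = word2vec.Word2Vec(sentences, workers=num_workers,
                          size=num_features, min_count=min_word_count,
                          window=context, sample=downsampling)
model.init_sims(replace=True)
# Save the model; it can be loaded later with Word2Vec.load()
model_name = "what"
model.save(model_name)
print("finished")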
I then test the model as follows:
model.doesnt_match("travel no Warning health".split())
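However, it produces no output at all.
I also don't understand the meaning of the large amount of output I got above. Why doesn't this work?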
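One thing that may be worth checking, offered as a guess and not a confirmed diagnosis: Word2Vec's `min_count` setting drops every word that occurs fewer than that many times in the corpus, so some of the test words may not be in the model's vocabulary at all. Lookups are also case-sensitive, so `Warning` would be missing if the tokenizer lowercases everything. The sketch below approximates that filter with a plain frequency count. The function name `words_in_vocab` and the toy `sentences` are assumptions for illustration, and the code does not call gensim.

```python
from collections import Counter


def words_in_vocab(sentences, query_words, min_count):
    """Return {word: corpus count} for the query words that meet min_count."""
    counts = Counter(word for sentence in sentences for word in sentence)
    return {w: counts[w] for w in query_words if counts[w] >= min_count}


# Toy stand-in for the tokenized posts (hypothetical data).
sentences = [["travel", "health", "travel"]] * 4 + [["no", "warning"]]
query = "travel no Warning health".split()

kept = words_in_vocab(sentences, query, min_count=3)
missing = [w for w in query if w not in kept]
print(kept)      # query words frequent enough to survive the filter
print(missing)   # query words that would be dropped
```

Separately, when the code runs as a script and not in an interactive session, the value returned by `model.doesnt_match(...)` is only displayed if it is passed to `print`.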