Unable to fix the seed value for an LDA model in gensim

When I use an LDA model, I get different topics every time, and I want to reproduce the same set. I searched Google for similar questions, such as this one.

Following that post, I fixed the seed with numpy.random.seed(1000), but it does not work. I read ldamodel.py and found the following code:

import numbers
import numpy

def get_random_state(seed):
    """
    Turn seed into a np.random.RandomState instance.
    Method originally from maciejkula/glove-python, and written by @joshloyal
    """
    # None or the numpy.random module itself: use numpy's global generator
    if seed is None or seed is numpy.random:
        return numpy.random.mtrand._rand
    # An integer: build a new generator seeded with it
    if isinstance(seed, (numbers.Integral, numpy.integer)):
        return numpy.random.RandomState(seed)
    # An existing generator: use it as-is
    if isinstance(seed, numpy.random.RandomState):
        return seed
    raise ValueError('%r cannot be used to seed a numpy.random.RandomState'
                     ' instance' % seed)

So I used this code:

lda = models.LdaModel(
    corpus_tfidf, 
    id2word=dic, 
    num_topics=2, 
    random_state=numpy.random.RandomState(10) 
)

But it still does not work.
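One way to rule out the seed itself is to check the generator in isolation. The sketch below uses only numpy; the gamma draw is just an illustrative stand-in for whatever random initialization the model performs, and the shape (2, 5) is an arbitrary assumption. It shows that a fresh RandomState(10) is reproducible, while a generator that has already been used is not.

```python
import numpy

# Two generators built from the same integer seed produce identical draws.
rng_a = numpy.random.RandomState(10)
rng_b = numpy.random.RandomState(10)
draws_a = rng_a.gamma(100.0, 1.0 / 100.0, size=(2, 5))
draws_b = rng_b.gamma(100.0, 1.0 / 100.0, size=(2, 5))
same_seed_matches = numpy.array_equal(draws_a, draws_b)

# Reusing one generator gives different draws: its state has already advanced.
draws_c = rng_a.gamma(100.0, 1.0 / 100.0, size=(2, 5))
reused_matches = numpy.array_equal(draws_a, draws_c)

print("fresh generators agree:", same_seed_matches)
print("reused generator agrees:", reused_matches)
```

If a fresh generator is created for every run, as in the call above, the random draws should match from run to run, so any remaining difference probably comes from the model's inputs rather than from the seed. Going by the get_random_state code quoted above, passing a plain integer as random_state should behave the same as passing a newly built RandomState.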

2016-09-21 Marcel.Shen

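The dictionary generated by corpora.Dictionary may differ between runs even for the same corpus (for example, the same words but in a different order). So you need to fix the dictionary as well as the seed to get the same topics. The code below may help fix the dictionary: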

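from gensim import corpora
# Build the dictionary once and save it
dic = corpora.Dictionary(corpus)
dic.save("filename")
# On later runs, load the saved copy instead of rebuilding
dic = corpora.Dictionary.load("filename")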
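To see why the word order matters, here is a small sketch that does not depend on gensim. build_id_map and to_bow are hypothetical helpers written for this illustration, not gensim APIs, and assigning ids in first-seen order is an assumption made only to keep the example simple. The same document ends up with different bag-of-words vectors when the vocabulary is seen in a different order, and sorting the vocabulary first removes that dependence.

```python
def build_id_map(tokens):
    """Assign ids in first-seen order (hypothetical helper, not a gensim API)."""
    id_map = {}
    for tok in tokens:
        id_map.setdefault(tok, len(id_map))
    return id_map


def to_bow(doc, id_map):
    """Convert a tokenized document to sorted (token_id, count) pairs."""
    counts = {}
    for tok in doc:
        tid = id_map[tok]
        counts[tid] = counts.get(tid, 0) + 1
    return sorted(counts.items())


doc = ["topic", "model", "topic"]

# Same vocabulary, seen in two different orders.
map_a = build_id_map(["topic", "model", "seed"])
map_b = build_id_map(["seed", "model", "topic"])

bow_a = to_bow(doc, map_a)  # topic has id 0, model has id 1
bow_b = to_bow(doc, map_b)  # model has id 1, topic has id 2

# Sorting the vocabulary first makes the mapping independent of input order.
stable_a = build_id_map(sorted(["topic", "model", "seed"]))
stable_b = build_id_map(sorted(["seed", "model", "topic"]))

print(bow_a, bow_b)
print(stable_a == stable_b)
```

A model trained on bow_a and one trained on bow_b see different inputs, so a fixed seed alone cannot make their topics match. Saving and reloading the dictionary, as in the answer above, pins the mapping in the same way that sorting does here.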

2016-09-23 01:55:03
