2017-01-27 27 views
0

我正在pymc3中實現隱馬爾可夫鏈。在實現隱藏狀態方面我已經做得相當好了。下面,我展示了一個簡單的2態Markov鏈:可以使用什麼pymc3蒙特卡羅步進機進行自定義分類分配?

import numpy as np 
import pymc3 as pm 
import theano.tensor as tt 

# Markov chain sample with 2 states that was created 
# to have prob 0->1 = 0.1 and prob 1->0 = 0.3 
sample = np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 
    0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 
    1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 
    0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0], 
dtype=np.uint8) 

我現在正在定義一個描述狀態的類。作爲輸入,我需要知道P1從狀態0移動到狀態1的概率P1,並且P2從1-> 0移動。我還需要知道的概率PA的第一狀態是0

class HMMStates(pm.Discrete): 
    """ 
    Hidden Markov Model States 
    Parameters 
    ---------- 
    P1 : tensor 
     probability to remain in state 1 
    P2 : tensor 
     probability to move from state 2 to state 1 

    """ 

    def __init__(self, PA=None, P1=None, P2=None, 
       *args, **kwargs): 
     super(HMMStates, self).__init__(*args, **kwargs) 
     self.PA = PA 
     self.P1 = P1 
     self.P2 = P2 
     self.mean = 0. 
     self.mode = tt.cast(0,dtype='int64') 

    def logp(self, x): 
     PA = self.PA 
     P1 = self.P1 
     P2 = self.P2 

     # now we need to create an array with probabilities 
     # so that for x=A: PA=P1, PB=(1-P1) 
     # and for x=B: PA=P2, PB=(1-P2) 

     choice = tt.stack((P1,P2)) 
     P = choice[x[:-1]] 

     x_i = x[1:] 

     ou_like = pm.Categorical.dist(P).logp(x_i) 
     return pm.Categorical.dist(PA).logp(x[0]) + tt.sum(ou_like) 

我非常自豪,我對谷歌theano瞭解到集團的高級索引忍者的技巧。你也可以用tt.switch來實現。我不太確定的是self.mode。我只是給了它0來避免測試錯誤。以下是如何在模型中使用該類來測試它是否有效。在這種情況下,狀態不是隱藏的,而是被觀察到的。

with pm.Model() as model: 
    # 2 state model 
    # P1 is probablility to stay in state 1 
    # P2 is probability to move from state 2 to state 1 
    P1 = pm.Dirichlet('P1', a=np.ones(2)) 
    P2 = pm.Dirichlet('P2', a=np.ones(2)) 

    PA = pm.Deterministic('PA',P2/(P2+1-P1)) 

    states = HMMStates('states',PA,P1,P2, observed=sample) 

    start = pm.find_MAP() 

    trace = pm.sample(5000, start=start) 

輸出很好地重現了數據。在下一個模型中,我會展示這個問題。這裏我不直接觀察狀態,而是添加了一些高斯噪聲的狀態(隱藏狀態)。如果您使用Metropolis步進機運行模型,那麼它會崩潰並顯示索引錯誤,我追溯到在分類分佈上使用Metropolis步進機的相關問題。不幸的是,適用於我的課程的唯一步進器是CategoricalGibbsMetropolis步進器,但它拒絕與我的課程一起工作,因爲它不是明確的Categorial Distribution。

gauss_sample = sample*1.0 + 0.1*np.random.randn(len(sample)) 
from scipy import optimize 
with pm.Model() as model2: 
    # 2 state model 
    # P1 is probablility to stay in state 1 
    # P2 is probability to move from state 2 to state 1 
    P1 = pm.Dirichlet('P1', a=np.ones(2)) 
    P2 = pm.Dirichlet('P2', a=np.ones(2)) 

    S = pm.InverseGamma('S',alpha=2.1, beta=1.1) 

    PA = pm.Deterministic('PA',P2/(P2+1-P1)) 

    states = HMMStates('states',PA,P1,P2, shape=len(gauss_sample)) 

    emission = pm.Normal('emission', 
         mu=tt.cast(states,dtype='float64'), 
         sd=S, 
         observed = gauss_sample) 

    start2 = pm.find_MAP(fmin=optimize.fmin_powell) 
    step1 = pm.Metropolis(vars=[P1, P2, S, PA, emission]) 
    step2 = pm.ElemwiseCategorical(vars=[states], values=[0,1]) 
    trace2 = pm.sample(10000, start=start, step=[step1,step2]) 

ElemwiseCategorical使它運行,但沒有爲我的狀態分配正確的值。這些州要麼全是0,要麼全是1。

我該如何告訴ElemwiseCategorial分配一個1s和0s狀態的向量,或者我怎樣才能讓CategorialGibbsMetropolis將我的分佈識別爲分類。這必定是自定義分發的常見問題。

+0

給我的問題更新。昨天,我侵入了我的pymc3發行版並刪除了CategoricalGibbsMetropolis中的代碼,這些代碼測試父分發是否爲分類。現在步進器與我的HMMStates類一起工作。我將在pymc3 github上發佈一條建議,讓CategoricalGibbsMetropolis步進器允許分類類。歡迎其他建議。 –

回答

1

由於我沒有聽到任何人對我的問題,我自己回答。我使用的技巧是由Chris Fonnesbeck在pymc3 github上提出的,我在這裏打開了這個問題。他建議將pm.Categorical分類。

class HMMStates(pm.Categorical): 
    """ 
    Hidden Markov Model States 
    Parameters 
    ---------- 
    P1 : tensor 
     probability to remain in state 1 
    P2 : tensor 
     probability to move from state 2 to state 1 

    """ 

    def __init__(self, PA=None, P1=None, P2=None, 
       *args, **kwargs): 
     super(pm.Categorical, self).__init__(*args, **kwargs) 
     self.PA = PA 
     self.P1 = P1 
     self.P2 = P2 
     self.k = 2 # two state model 
     self.mean = 0. 
     self.mode = tt.cast(0,dtype='int64') 

    def logp(self, x): 
     PA = self.PA 
     P1 = self.P1 
     P2 = self.P2 

     # now we need to create an array with probabilities 
     # so that for x=A: PA=P1, PB=(1-P1) 
     # and for x=B: PA=P2, PB=(1-P2) 
     PT = tt.stack((P1,P2)) 

     P = PT[x[:-1]] 

     x_i = x[1:] 

     ou_like = pm.Categorical.dist(P, shape=(N_chain-1,2)).logp(x_i) 
     return pm.Categorical.dist(PA).logp(x[0]) + tt.sum(ou_like) 

我HMMStates不能真正稱之爲pm.Categorical超級初始化,所以我調用父類pm.Categorical的是pm.Discrete。這一訣竅使其通過了BinaryGibbsMetropolis和CategoricalGibbsMetropolis的測試。

如果您有興趣實施2狀態和多狀態HMM,我將執行所有這些情況here

相關問題