2016-03-04

Fitting data to hmm.MultinomialHMM

I am trying to predict the optimal hidden-state sequence for some data using the hmmlearn library, but I get an error. My code is:

from hmmlearn import hmm 
trans_mat = np.array([[0.2,0.6,0.2],[0.4,0.0,0.6],[0.1,0.2,0.7]]) 
emm_mat = np.array([[0.2,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1],[0.1,0.1,0.1,0.1,0.2,0.1,0.1,0.1,0.1],[0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.2]]) 
start_prob = np.array([0.3,0.4,0.3]) 
X = [3,4,5,6,7] 
model = GaussianHMM(n_components = 3, n_iter = 1000) 
X = np.array(X) 
model.startprob_ = start_prob 
model.transmat_ = trans_mat 
model.emissionprob_ = emm_mat 

# Predict the optimal sequence of internal hidden state 
x = model.fit([X]) 

print(model.decode([X])) 

However, I get the following error:

Traceback (most recent call last): 
    File "hmm_loyalty.py", line 55, in <module> 
    x = model.fit([X]) 
    File "build/bdist.macosx-10.6-x86_64/egg/hmmlearn/base.py", line 421, in fit 
    File "build/bdist.macosx-10.6-x86_64/egg/hmmlearn/hmm.py", line 183, in _init 
    File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sklearn/cluster/k_means_.py", line 785, in fit 
    X = self._check_fit_data(X) 
    File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sklearn/cluster/k_means_.py", line 758, in _check_fit_data 
X.shape[0], self.n_clusters)) 
ValueError: n_samples=1 should be >= n_clusters=3 

Does anyone have an idea what this means and what I can do to fix it?

Answer

There are a few problems with your code:

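  1. model is a GaussianHMM. You probably want a MultinomialHMM.
  2. The input X has the wrong shape. For MultinomialHMM, X must have shape (n_samples, 1), because the observations are one-dimensional.
  3. You do not need to call fit unless some model parameters have to be estimated, which is not the case here.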
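
The shape problem in point 2 can be checked with NumPy alone. This is a minimal sketch using only the values from the question; the names as_passed and n_components are introduced here for illustration. Wrapping the 1-D array in a list yields a single sample with five features, which is consistent with the k-means initialisation in the traceback reporting n_samples=1 for 3 clusters.

```python
import numpy as np

X = np.array([3, 4, 5, 6, 7])
n_components = 3  # number of hidden states requested in the question

# model.fit([X]) hands over a 2-D array with ONE row (hypothetical name: as_passed).
as_passed = np.array([X])
print(as_passed.shape)  # (1, 5)

# The first axis is read as n_samples, so a "n_samples >= n_clusters" check fails.
n_samples = as_passed.shape[0]
print(n_samples >= n_components)  # False
```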

Here is a working version:

import numpy as np 
from hmmlearn import hmm 

model = hmm.MultinomialHMM(n_components=3) 
model.startprob_ = np.array([0.3, 0.4, 0.3]) 
model.transmat_ = np.array([[0.2, 0.6, 0.2],
                            [0.4, 0.0, 0.6],
                            [0.1, 0.2, 0.7]])
model.emissionprob_ = np.array([[0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
                                [0.1, 0.1, 0.1, 0.1, 0.2, 0.1, 0.1, 0.1, 0.1],
                                [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.2]])

# Predict the optimal sequence of internal hidden state 
X = np.atleast_2d([3, 4, 5, 6, 7]).T 
print(model.decode(X)) 
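
For reference, decode uses the Viterbi algorithm by default, so once all parameters are set by hand there is nothing left to estimate. The following is a minimal NumPy sketch of that algorithm, not hmmlearn's actual implementation. The function name viterbi and its argument order are made up for illustration.

```python
import numpy as np

def viterbi(startprob, transmat, emissionprob, obs):
    """Return (log probability, most likely state path) for integer observations."""
    # log(0) = -inf is intended here: it marks impossible transitions.
    with np.errstate(divide="ignore"):
        log_start = np.log(startprob)
        log_trans = np.log(transmat)
        log_emit = np.log(emissionprob)

    n_steps, n_states = len(obs), len(startprob)
    delta = np.empty((n_steps, n_states))               # best log score ending in each state
    backptr = np.zeros((n_steps, n_states), dtype=int)  # best predecessor of each state

    delta[0] = log_start + log_emit[:, obs[0]]
    for t in range(1, n_steps):
        # scores[i, j]: best score of being in state i at t-1, then moving to j
        scores = delta[t - 1][:, None] + log_trans
        backptr[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_emit[:, obs[t]]

    # Walk the back-pointers from the best final state.
    path = np.empty(n_steps, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(n_steps - 1, 0, -1):
        path[t - 1] = backptr[t, path[t]]
    return delta[-1].max(), path

startprob = np.array([0.3, 0.4, 0.3])
transmat = np.array([[0.2, 0.6, 0.2],
                     [0.4, 0.0, 0.6],
                     [0.1, 0.2, 0.7]])
emissionprob = np.array([[0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
                         [0.1, 0.1, 0.1, 0.1, 0.2, 0.1, 0.1, 0.1, 0.1],
                         [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.2]])

logprob, states = viterbi(startprob, transmat, emissionprob, [3, 4, 5, 6, 7])
print(logprob, states)
```

Working in log space avoids numerical underflow on long sequences, which is why the result is a log probability rather than a probability.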

Just a quick follow-up: why do we take the transpose of X? – lordingtar

Because after 'np.atleast_2d', 'X' has shape '(1, n_samples)'. –
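
The shapes involved are easy to confirm. This is a small sketch using only NumPy calls; reshape(-1, 1) is shown as an alternative spelling that was not part of the original answer.

```python
import numpy as np

row = np.atleast_2d([3, 4, 5, 6, 7])
print(row.shape)  # (1, 5): a single row, i.e. (1, n_samples)

col = row.T
print(col.shape)  # (5, 1): one observation per row, i.e. (n_samples, 1)

# An equivalent way to build the same column vector.
alt = np.array([3, 4, 5, 6, 7]).reshape(-1, 1)
print(np.array_equal(col, alt))  # True
```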

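Suppose I had not set the model parameters: how would I call the fit function? It says I need (n_samples, 1), but the X shape above does not work for me. It still says ValueError: expected a sample from a Multinomial distribution – lordingtar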
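
One possible cause, offered as an assumption rather than a confirmed diagnosis: in hmmlearn releases from around that time, the input check behind this message appears to have required non-negative integer symbols that cover 0..K-1 without gaps, and [3, 4, 5, 6, 7] does not start at 0. If that is the case, remapping the symbols before fitting may help. The sketch below uses NumPy only; the names symbols, codes and X_col are illustrative, and whether the result passes the check depends on the installed version.

```python
import numpy as np

raw = np.array([3, 4, 5, 6, 7])

# Map the observed symbols onto contiguous codes 0..K-1.
symbols, codes = np.unique(raw, return_inverse=True)
print(codes)  # [0 1 2 3 4]

# Column vector of shape (n_samples, 1), the layout the answer describes.
X_col = codes.reshape(-1, 1)
print(X_col.shape)  # (5, 1)

# symbols[codes] recovers the original values when reading results back.
print(symbols[codes])  # [3 4 5 6 7]
```

X_col would then be the array passed to fit. Five observations are very little data for estimating a three-state model, so this only illustrates the input format.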