GaussianMixture initialization using component parameters - sklearn

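I want to use sklearn.mixture.GaussianMixture to store a Gaussian mixture model so that I can later use it to generate samples, or to evaluate the density at given points with the score_samples method. Here is an example in which the components have the following weights, means, and covariances: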

import numpy as np 
weights = np.array([0.6322941277066596, 0.3677058722933399]) 
mu = np.array([[0.9148052872961359, 1.9792961751316835], 
       [-1.0917396392992502, -0.9304220945910037]]) 
sigma = np.array([[[2.267889129267119, 0.6553245618368836], 
         [0.6553245618368835, 0.6571014653342457]], 
         [[0.9516607767206848, -0.7445831474157608], 
         [-0.7445831474157608, 1.006599716443763]]])

Then I initialized the mixture as follows:

from sklearn import mixture 
gmix = mixture.GaussianMixture(n_components=2, covariance_type='full') 
gmix.weights_ = weights # mixture weights (n_components,) 
gmix.means_ = mu   # mixture means (n_components, 2) 
gmix.covariances_ = sigma # mixture cov (n_components, 2, 2)

Finally, I tried to generate samples from these parameters, which resulted in an error:

x = gmix.sample(1000) 
NotFittedError: This GaussianMixture instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.

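As far as I understand, GaussianMixture is intended to fit a mixture of Gaussians to samples, but is there a way to provide it with the final values and continue from there?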


2017-02-22 hashmuke

First, you need to fit the model on data before you can generate random samples. See the [documentation for sample()](http://scikit-learn.org/stable/modules/generated/sklearn.mixture.GaussianMixture.html#sklearn.mixture.GaussianMixture.sample)

I have no initial data; all I have is the parameters of each component. I am looking for a workaround or an alternative Python library. – hashmuke

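You rock, J.P.Petersen! After seeing your answer, I compared the changes introduced by calling the fit method. It looks like the initial instantiation does not create all of the attributes of gmix. Specifically, it is missing the following attributes: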

covariances_ 
means_ 
weights_ 
converged_ 
lower_bound_ 
n_iter_ 
precisions_ 
precisions_cholesky_

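The first three are created when the given inputs are assigned. Of the rest, the only attribute I need for my application is precisions_cholesky_, which is the Cholesky decomposition of the inverse covariance matrices. As a minimum requirement, I added it as follows: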

# score_samples needs C[k] @ C[k].T == inv(sigma[k]), so the lower-triangular factor is used without transposing
gmix.precisions_cholesky_ = np.linalg.cholesky(np.linalg.inv(sigma))
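For reference, here is a minimal self-contained sketch of this approach using the parameters from the question. It assumes a scikit-learn version in which sample and score_samples only need weights_, means_, covariances_ and precisions_cholesky_ to be present. The helper reference_log_density and the names points, labels and matches are hypothetical and exist only to check the result. When scoring, scikit-learn uses each factor C so that C @ C.T is the precision matrix, which is why the factor returned by np.linalg.cholesky is not transposed here.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Component parameters from the question
weights = np.array([0.6322941277066596, 0.3677058722933399])
mu = np.array([[0.9148052872961359, 1.9792961751316835],
               [-1.0917396392992502, -0.9304220945910037]])
sigma = np.array([[[2.267889129267119, 0.6553245618368836],
                   [0.6553245618368835, 0.6571014653342457]],
                  [[0.9516607767206848, -0.7445831474157608],
                   [-0.7445831474157608, 1.006599716443763]]])

gmix = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
gmix.weights_ = weights
gmix.means_ = mu
gmix.covariances_ = sigma
# Lower-triangular C[k] with C[k] @ C[k].T == inv(sigma[k])
gmix.precisions_cholesky_ = np.linalg.cholesky(np.linalg.inv(sigma))

# sample() returns a tuple: the points and the component label of each point
points, labels = gmix.sample(1000)


def reference_log_density(X):
    """Log-density of the 2-D mixture, computed directly from its formula."""
    density = np.zeros(len(X))
    for w, m, s in zip(weights, mu, sigma):
        d = X - m
        quad = np.einsum("ni,ij,nj->n", d, np.linalg.inv(s), d)
        density += w * np.exp(-0.5 * quad) / (2 * np.pi * np.sqrt(np.linalg.det(s)))
    return np.log(density)


# Compare scikit-learn's log-density with the hand-computed reference
log_density = gmix.score_samples(points)
matches = np.allclose(log_density, reference_log_density(points))
```

If matches is False on your installation, the assumption about which attributes scikit-learn reads probably does not hold for that version.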


2017-02-22 17:33:32 hashmuke

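It seems there is a check that makes sure the model has been trained. You can trick it by training the GMM on a very small dataset before setting the parameters. Like this: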

gmix = mixture.GaussianMixture(n_components=2, covariance_type='full') 
gmix.fit(np.random.rand(10, 2)) # Now it thinks it is trained 
gmix.weights_ = weights # mixture weights (n_components,) 
gmix.means_ = mu   # mixture means (n_components, 2) 
gmix.covariances_ = sigma # mixture cov (n_components, 2, 2) 
x = gmix.sample(1000) # Should work now
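
One caveat with this trick, sketched below under the assumption that score_samples reads precisions_cholesky_ rather than covariances_: after the dummy fit, precisions_cholesky_ still describes the fitted covariances, so it has to be overwritten too if densities are needed. The helper precisions_match and the names stale and fresh are hypothetical, and the dummy data size (20 points) is an arbitrary choice.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Component parameters from the question
weights = np.array([0.6322941277066596, 0.3677058722933399])
mu = np.array([[0.9148052872961359, 1.9792961751316835],
               [-1.0917396392992502, -0.9304220945910037]])
sigma = np.array([[[2.267889129267119, 0.6553245618368836],
                   [0.6553245618368835, 0.6571014653342457]],
                  [[0.9516607767206848, -0.7445831474157608],
                   [-0.7445831474157608, 1.006599716443763]]])

# Fit on a small dummy dataset so that every fitted attribute exists
rng = np.random.default_rng(0)
gmix = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
gmix.fit(rng.random((20, 2)))

# Overwrite the parameters as in the answer above
gmix.weights_ = weights
gmix.means_ = mu
gmix.covariances_ = sigma


def precisions_match(model, covariances):
    """Check whether precisions_cholesky_ is consistent with the covariances."""
    chol = model.precisions_cholesky_
    rebuilt = chol @ np.swapaxes(chol, 1, 2)  # C[k] @ C[k].T for each component
    return np.allclose(rebuilt, np.linalg.inv(covariances))


# At this point the factors still come from the dummy fit
stale = precisions_match(gmix, sigma)

# Recompute the factors from the supplied covariances
gmix.precisions_cholesky_ = np.linalg.cholesky(np.linalg.inv(sigma))
fresh = precisions_match(gmix, sigma)
```

Sampling only uses weights_, means_ and covariances_, so the stale factors matter only for methods that evaluate the density, such as score_samples or predict_proba.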


2017-02-22 15:29:22

To understand what is happening: the first thing GaussianMixture does is check that it has been fitted:

self._check_is_fitted()

which triggers the following check:

def _check_is_fitted(self): 
    check_is_fitted(self, ['weights_', 'means_', 'precisions_cholesky_'])

and finally the last function call:

def check_is_fitted(estimator, attributes, msg=None, all_or_any=all):

which only checks that the estimator already has these attributes.
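
The attribute check can be observed directly with the public check_is_fitted helper from sklearn.utils.validation. This is only a sketch: the wrapper is_fitted and the placeholder parameter values are hypothetical, and the attribute list is the one quoted above, which newer scikit-learn versions may no longer pass explicitly.

```python
import numpy as np
from sklearn.exceptions import NotFittedError
from sklearn.mixture import GaussianMixture
from sklearn.utils.validation import check_is_fitted


def is_fitted(estimator):
    """Return True if check_is_fitted accepts the estimator."""
    try:
        check_is_fitted(estimator, ["weights_", "means_", "precisions_cholesky_"])
        return True
    except NotFittedError:
        return False


gm = GaussianMixture(n_components=2, covariance_type="full")
before = is_fitted(gm)  # no attributes set yet

# Placeholder values; the check only looks at whether the attributes exist
gm.weights_ = np.array([0.5, 0.5])
gm.means_ = np.zeros((2, 2))
gm.precisions_cholesky_ = np.stack([np.eye(2), np.eye(2)])
after = is_fitted(gm)
```

The check says nothing about whether the values are valid, which is why a placeholder gets past it but can still break later computations.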

So, in short, the only thing you are missing to make it work (without having to fit it) is to set the precisions_cholesky_ attribute:

gmix.precisions_cholesky_ = 0

should do the trick (I can't try it, so I'm not 100% sure :P)

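However, if you want to play it safe and have a solution that keeps working in case scikit-learn updates its constraints, @JPPetersen's solution is probably the best way to go.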


2017-02-22 19:05:16

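Yeah, that explains things. I initially tried assigning 'gmix.precisions_cholesky_ = None', and with that I was able to generate samples. However, if you call 'score_samples', this will not work: it expects the value to be a numpy array with dimensions similar to those of the covariances. – hashmuke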
