2014-02-25

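To simplify my problem, I have created a dummy problem here: I have two training datasets, labelled 1 and 2. Assume that both training datasets follow a mixture-of-Gaussians (MoG) distribution. I can easily estimate their means and covariances with the Matlab toolbox function gmdistribution.fit. How can I use these mixture-of-Gaussians models to obtain a likelihood?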

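Then I have a test dataset, assumed to have been generated by a MoG similar to that of training dataset 2, but with noise. I want to compute something like a likelihood, i.e. a measure of how much more likely it is that my test dataset was generated by the MoG of training dataset 2. In other words, I want the likelihood that my test dataset has label 2.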

Could you point me in the right direction on how to do this? Many thanks.

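N.B.:

  1. The two training datasets differ in size.
  2. The distributions of the two training datasets overlap.
  3. The test dataset is smaller than the training datasets.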
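
One common way to formalise the quantity being asked for is sketched below. It assumes the test points x_1, ..., x_N are independent draws from a single model. The symbols are introduced here for illustration and do not appear in the question: M_k is the number of components of model k, \pi_{kj}, \mu_{kj} and \Sigma_{kj} are its fitted weights, means and covariances, and P(k) is a prior probability for label k.

```latex
% Density of a single point under fitted mixture k
p_k(x) = \sum_{j=1}^{M_k} \pi_{kj}\, \mathcal{N}\!\left(x \mid \mu_{kj}, \Sigma_{kj}\right),
\qquad k \in \{1, 2\}

% Log-likelihood of the whole test set under model k
\log L_k = \sum_{i=1}^{N} \log p_k(x_i)

% Posterior probability that the test set carries label k (Bayes' rule)
P(k \mid x_1, \dots, x_N) = \frac{P(k)\, L_k}{P(1)\, L_1 + P(2)\, L_2}
```

Under this formulation the training-set sizes do not enter the likelihoods directly; if they are meant to matter, they could be reflected in the priors P(k). Overlap between the two distributions simply makes L_1 and L_2 closer, and a small N makes the comparison less decisive.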

Some Matlab code:

%% Mixture of Gaussians 1 (Training set 1)
mean1            = [1 -2];
cov1             = [2 0; 0 .5];
mean2            = [0.5 -5];
cov2             = [1 0; 0 1];
% 1000 samples from each of the two components
trainingDataset1 = [mvnrnd(mean1, cov1, 1000); mvnrnd(mean2, cov2, 1000)];

MoGOptions       = statset('Display', 'final');
MoGObj1          = gmdistribution.fit(trainingDataset1, 2, 'Options', MoGOptions);

figure, 
scatter(trainingDataset1(:,1), trainingDataset1(:,2), 10, '.') 
hold on 
ezcontour(@(x,y)pdf(MoGObj1,[x y]), [-8 6], [-8 2]); 

%% Mixture of Gaussians 2 (Training set 2)
mean4            = [0.5 -1];
cov4             = [1.5 0; 0 .8];
mean5            = [-2 -3];
cov5             = [1 0; 0 1];
mean6            = [-4 -2];
cov6             = [1 0; 0 1];
% 500 samples from each of the three components
trainingDataset2 = [mvnrnd(mean4, cov4, 500); mvnrnd(mean5, cov5, 500); mvnrnd(mean6, cov6, 500)];

MoGOptions       = statset('Display', 'final');
% Fit three components, matching the three Gaussians used to generate the data
MoGObj2          = gmdistribution.fit(trainingDataset2, 3, 'Options', MoGOptions);

figure, 
scatter(trainingDataset2(:,1), trainingDataset2(:,2), 10, '.') 
hold on 
ezcontour(@(x,y)pdf(MoGObj2,[x y]), [-8 6], [-8 2]); 

%% Test set
mean7            = [1.1 -2.1];
cov7             = [2.2 0; 0 .4];
mean8            = [0.3 -5.4];
cov8             = [1.2 0; 0 1.1];
testingDataset1  = [mvnrnd(mean7, cov7, 100); mvnrnd(mean8, cov8, 100)];

figure, 
scatter(testingDataset1(:,1), testingDataset1(:,2), 10, '.') 

Answers
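
A minimal sketch of one possible approach, not a definitive answer: evaluate each fitted mixture on the test set with pdf, sum the log-densities to get a log-likelihood per model, and turn the two values into posterior label probabilities. The variable names (train1, gm1, testSet and so on), the fixed seed, the 'Replicates' setting and the equal priors are assumptions made for illustration. The test points are treated as independent, and the data are regenerated with the same parameters as in the question so that the snippet runs on its own.

```matlab
% Sketch: which fitted mixture explains the test set better?
rng(0);  % assumption: fixed seed, only to make the illustration repeatable

% Regenerate the data with the parameters used in the question
train1  = [mvnrnd([1 -2],    [2 0; 0 .5],    1000); ...
           mvnrnd([0.5 -5],  eye(2),         1000)];
train2  = [mvnrnd([0.5 -1],  [1.5 0; 0 .8],  500); ...
           mvnrnd([-2 -3],   eye(2),         500); ...
           mvnrnd([-4 -2],   eye(2),         500)];
testSet = [mvnrnd([1.1 -2.1], [2.2 0; 0 .4],   100); ...
           mvnrnd([0.3 -5.4], [1.2 0; 0 1.1],  100)];

% Fit one mixture per label; several replicates reduce the risk of a poor local optimum
gm1 = gmdistribution.fit(train1, 2, 'Replicates', 5);
gm2 = gmdistribution.fit(train2, 3, 'Replicates', 5);

% Log-likelihood of the whole test set under each model
% (pdf returns one density per row; log of a tiny density can underflow to -Inf)
logL1 = sum(log(pdf(gm1, testSet)));
logL2 = sum(log(pdf(gm2, testSet)));

% Cross-check: the second output of posterior is the negative log-likelihood
[~, nlogL1] = posterior(gm1, testSet);

% Average per-point log-likelihood, which does not grow with the test-set size
avgLogL1 = logL1 / size(testSet, 1);
avgLogL2 = logL2 / size(testSet, 1);

% Posterior probability of each label, assuming equal priors
prior   = [0.5 0.5];
logPost = [logL1 logL2] + log(prior);
logPost = logPost - max(logPost);            % shift for numerical stability
post    = exp(logPost) / sum(exp(logPost));

fprintf('logL1 = %.2f, logL2 = %.2f\n', logL1, logL2);
fprintf('avg per point: %.3f vs %.3f\n', avgLogL1, avgLogL2);
fprintf('P(label 1) = %.4f, P(label 2) = %.4f\n', post(1), post(2));
```

Note that in the code as posted, the test-set parameters are close to those of training set 1 rather than training set 2, so model 1 would be expected to score higher here. Because the log-likelihoods are sums over all test points, the posterior tends to saturate near 0 or 1 even for modest test sets; the difference logL1 - logL2 or the per-point averages are often easier to interpret. Priors other than 0.5 could be used if the label frequencies are known.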
