Choosing the optimal number of features with PCA/LDA/MDS in scikit

I want to use PCA, LDA and MDS to reduce the number of features in my dataset, but I want to retain 95% of the variance.

I could not find a way to specify the required variance in the parameters of the respective algorithms.

if n_components == 'mle', Minka's MLE is used to guess the dimension
if 0 < n_components < 1, select the number of components such that the amount of variance that needs to be explained is greater than the percentage specified by n_components

This paragraph from the PCA API (sklearn.decomposition.PCA) seems relevant, but how can n_components be equal to 'mle' and a fraction at the same time?

Setting n_components='mle' only reduces the number of features from 40 to 39, which does not help.

Source

2014-12-28 goelakash

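The PCA object in sklearn.decomposition has an attribute called "explained_variance_ratio_". It is an array giving the fraction of the total variance that each principal component accounts for, in decreasing order.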

So you can first create a PCA object and fit it to the data:

from sklearn.decomposition import PCA

pca_obj = PCA()                       # keep all components for now
x_trans = pca_obj.fit_transform(x)    # x is the data
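To get a feel for what the attribute holds, here is a minimal sketch on synthetic data. The array x below is a made-up stand-in, not the data from the question. When all components are kept, the ratios should come out sorted from largest to smallest and add up to 1.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical stand-in data: 100 samples, 5 features with different spreads.
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 5)) * np.array([3.0, 2.0, 1.0, 0.5, 0.1])

# Fit with all components kept and read the per-component variance ratios.
ratios = PCA().fit(x).explained_variance_ratio_
print(ratios)          # one entry per principal component, largest first
print(ratios.sum())    # close to 1.0 because no component was dropped
```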

Now we can keep adding up the variance ratios until we reach the desired value (0.95 in my case):

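s = pca_obj.explained_variance_ratio_
total = 0.0   # running sum of explained variance (avoids shadowing the built-in sum)
comp = 0      # number of components counted so far

for ratio in s:
    total += ratio
    comp += 1
    if total >= 0.95:
        break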

The required number of components is then the value of comp.
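The same count can be obtained without an explicit loop, and the docstring quoted in the question describes two separate options rather than one: n_components may be either 'mle' or a fraction between 0 and 1. The sketch below, on made-up data (x is hypothetical, not the asker's dataset), computes the count with a cumulative sum and then passes 0.95 directly to PCA. Setting svd_solver="full" explicitly is a cautious choice here, since the fractional form relies on the full SVD solver.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical stand-in data: 200 samples, 40 features with decaying scale.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 40)) * np.linspace(5.0, 0.1, 40)

# Vectorized form of the loop: cumulative sum of the sorted variance ratios.
pca_full = PCA().fit(x)
cumulative = np.cumsum(pca_full.explained_variance_ratio_)
# Index of the first cumulative value >= 0.95, converted to a component count.
n_keep = int(np.searchsorted(cumulative, 0.95) + 1)

# Alternative: let PCA choose the count by passing the fraction itself.
pca_95 = PCA(n_components=0.95, svd_solver="full").fit(x)
x_reduced = pca_95.transform(x)

print(n_keep, pca_95.n_components_, x_reduced.shape)
```

Both routes should report the same number of components, except in the unlikely case where the cumulative ratio lands exactly on 0.95. This applies to PCA only; MDS does not expose an explained-variance attribute, so the trick does not carry over to it as is.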

Source

2014-12-28 13:00:41 goelakash
