Performing SVD with sklearn.decomposition.PCA: how can I get U, S and V from it?

I am performing SVD with sklearn.decomposition.PCA.

From the SVD equation

A = U × S × V^T

where V^T is the transpose of the matrix V (sorry, I can't paste the original equation).

If I want the matrices U, S and V, how can I get them when I use sklearn.decomposition.PCA?

2017-04-04 Plafishy Phannakan

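First, depending on the size of the matrix, sklearn's PCA implementation does not always compute the full SVD decomposition. The following is taken from PCA's GitHub repository: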

svd_solver : string {'auto', 'full', 'arpack', 'randomized'} 
     auto : 
      the solver is selected by a default policy based on `X.shape` and 
      `n_components`: if the input data is larger than 500x500 and the 
      number of components to extract is lower than 80% of the smallest 
      dimension of the data, then the more efficient 'randomized' 
      method is enabled. Otherwise the exact full SVD is computed and 
      optionally truncated afterwards. 
     full : 
      run exact full SVD calling the standard LAPACK solver via 
      `scipy.linalg.svd` and select the components by postprocessing 
     arpack : 
      run SVD truncated to n_components calling ARPACK solver via 
      `scipy.sparse.linalg.svds`. It requires strictly 
      0 < n_components < X.shape[1] 
     randomized : 
      run randomized SVD by the method of Halko et al.

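In addition, PCA also performs some operations on the data before the decomposition, such as centering it on the mean (see here).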

Now, if you want to get the U, S and V that sklearn.decomposition.PCA uses internally, you can call pca._fit(X). For example:

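import numpy as np
from sklearn.decomposition import PCA

X = np.array([[1, 2], [3, 5], [8, 10], [-1, 1], [5, 6]])
pca = PCA(n_components=2)
# _fit is a private method, so what it returns may vary between scikit-learn versions
pca._fit(X)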

This prints:

(array([[ -3.55731195e-01, 5.05615563e-01], 
     [ 2.88830295e-04, -3.68261259e-01], 
     [ 7.10884729e-01, -2.74708608e-01], 
     [ -5.68187889e-01, -4.43103380e-01], 
     [ 2.12745524e-01, 5.80457684e-01]]), 
array([ 9.950385 , 0.76800941]), 
array([[ 0.69988535, 0.71425521], 
     [ 0.71425521, -0.69988535]]))
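Because pca._fit is private, it may be safer to rebuild the same factors from the public attributes of a fitted PCA. The following is only a sketch, assuming whiten=False, non-zero singular values, and a scikit-learn version that exposes singular_values_. Note that the factors describe the mean-centered data, not X itself.

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.array([[1, 2], [3, 5], [8, 10], [-1, 1], [5, 6]], dtype=float)

# svd_solver="full" asks for the exact LAPACK-based SVD
pca = PCA(n_components=2, svd_solver="full").fit(X)

Vt = pca.components_        # rows are the right singular vectors
S = pca.singular_values_    # singular values of the centered data
U = pca.transform(X) / S    # projected data equals U * S, so divide S back out

# The factors reconstruct the centered data, not X itself
X_centered = X - pca.mean_
print(np.allclose(U @ np.diag(S) @ Vt, X_centered))
```

If n_components is smaller than the number of features, the same attributes give a truncated decomposition, and the product only approximates the centered data.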

However, if you just want the SVD decomposition of the original data, I would suggest using scipy.linalg.svd.
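A minimal sketch of that suggestion, reusing the same example matrix. The choice of full_matrices=False (the "thin" SVD) is an assumption made here to keep U the same shape as the data; it is not something the answer specifies.

```python
import numpy as np
from scipy.linalg import svd

X = np.array([[1, 2], [3, 5], [8, 10], [-1, 1], [5, 6]], dtype=float)

# Thin SVD: U is 5x2, s holds 2 singular values, Vt is 2x2
U, s, Vt = svd(X, full_matrices=False)

# Rebuild X from the factors to confirm X = U @ diag(s) @ Vt
X_rebuilt = U @ np.diag(s) @ Vt
print(np.allclose(X, X_rebuilt))
```

Unlike PCA, this decomposes X as given, with no centering, so the factors will generally differ from the ones PCA uses.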


2017-04-04 16:12:36

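I'm not familiar with Python, so this is hard for me. One more question: if the data is very large, which is better, 'randomized' or 'full'? Which should I choose for my dataset? Are the results of 'randomized' and 'full' different?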

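One option is simply not to choose: the default is 'auto', and the solver then picks the method for you (based on the matrix size and the number of components). The idea is that if the matrix is very large and you don't need all the components (that is, you need fewer than 80% of the smallest dimension of the data), then randomized SVD is more efficient and you won't lose much by using it. In any case, if you do need all the components (or more than 80%), 'auto' guarantees that the full SVD is used. Randomized SVD is based on this paper: https://arxiv.org/pdf/0909.4061.pdf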
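To see how far the two solvers differ on your own data, you could fit both and compare the leading singular values. This is just a rough sketch: the synthetic low-rank matrix, its size, and the seed are invented for illustration, and the gap on real data may be larger.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Synthetic data: rank-5 signal plus a little noise (made up for illustration)
X = rng.normal(size=(1000, 5)) @ rng.normal(size=(5, 50))
X += 0.01 * rng.normal(size=X.shape)

full = PCA(n_components=5, svd_solver="full").fit(X)
rand = PCA(n_components=5, svd_solver="randomized", random_state=0).fit(X)

# Relative gap between the leading singular values from the two solvers
rel_diff = np.abs(full.singular_values_ - rand.singular_values_) / full.singular_values_
print(rel_diff.max())
```

Fixing random_state makes the randomized result repeatable between runs.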
