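I am trying to implement PCA, and it works well for the intermediate results such as the eigenvalues and eigenvectors. However, when I try to project the data (3-dimensional) onto the 2-dimensional principal component space, the result is wrong. I have spent a lot of time comparing my code with other implementations, for example: Python PCA - projection into lower dimensional space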
http://sebastianraschka.com/Articles/2014_pca_step_by_step.html
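However, after a long time without progress, I still cannot find the mistake. Since the intermediate results are correct, I assume the problem is a simple coding error. Thanks in advance to everyone who actually reads this question, and thanks to those who provide helpful comments/answers.
My code is as follows: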
import numpy as np

class PCA():
    def __init__(self, X):
        #center the data
        X = X - X.mean(axis=0)
        #calculate covariance matrix based on X where data points are represented in rows
        C = np.cov(X, rowvar=False)
        #get eigenvectors and eigenvalues
        d,u = np.linalg.eigh(C)
        #sort both eigenvectors and eigenvalues descending regarding the eigenvalue
        #the output of np.linalg.eigh is sorted ascending, therefore both are turned around to reach a descending order
        self.U = np.asarray(u).T[::-1]
        self.D = d[::-1]

    # **problem starts here**
    def project(self, X, m):
        #use the top m eigenvectors with the highest eigenvalues for the transformation matrix
        Z = np.dot(X,np.asmatrix(self.U[:m]).T)
        return Z
The result of my code is:
myresult
([[ 0.03463706, -2.65447128],
[-1.52656731, 0.20025725],
[-3.82672364, 0.88865609],
[ 2.22969475, 0.05126909],
[-1.56296316, -2.22932369],
[ 1.59059825, 0.63988429],
[ 0.62786254, -0.61449831],
[ 0.59657118, 0.51004927]])
Correct result (for example, as produced by sklearn's PCA):
([[ 0.26424835, -2.25344912],
[-1.29695602, 0.60127941],
[-3.59711235, 1.28967825],
[ 2.45930604, 0.45229125],
[-1.33335186, -1.82830153],
[ 1.82020954, 1.04090645],
[ 0.85747383, -0.21347615],
[ 0.82618248, 0.91107143]])
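One quick way to compare the two outputs is to subtract them row by row. The sketch below only uses the numbers printed above; the names `mine`, `reference`, and `offset` are made up for illustration. If every row of the difference is (nearly) the same, the two projections differ only by a constant shift per component, which would point towards a translation issue (such as a missing mean subtraction) rather than wrong eigenvectors.

```python
import numpy as np

# Output of my project() method, copied from above.
mine = np.array([
    [ 0.03463706, -2.65447128],
    [-1.52656731,  0.20025725],
    [-3.82672364,  0.88865609],
    [ 2.22969475,  0.05126909],
    [-1.56296316, -2.22932369],
    [ 1.59059825,  0.63988429],
    [ 0.62786254, -0.61449831],
    [ 0.59657118,  0.51004927]])

# Reference output, copied from above.
reference = np.array([
    [ 0.26424835, -2.25344912],
    [-1.29695602,  0.60127941],
    [-3.59711235,  1.28967825],
    [ 2.45930604,  0.45229125],
    [-1.33335186, -1.82830153],
    [ 1.82020954,  1.04090645],
    [ 0.85747383, -0.21347615],
    [ 0.82618248,  0.91107143]])

# Row-wise difference between the two projections.
offset = reference - mine

print(offset)
# True if every row carries the same shift (up to print rounding).
print(np.allclose(offset, offset[0], atol=1e-6))
```

For the values above, the shift is the same in every row, roughly 0.2296 in the first component and 0.4010 in the second.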
The input is defined as follows:
X = np.array([
[-2.133268233289599,0.903819474847349,2.217823388231679,-0.444779660856219,-0.661480010318842,-0.163814281248453,-0.608167714051449, 0.949391996219125],
[-1.273486742804804,-1.270450725314960,-2.873297536940942, 1.819616794091556,-2.617784834189455, 1.706200163080549,0.196983250752276,0.501491995499840],
[-0.935406638147949,0.298594472836292,1.520579082270122,-1.390457671168661,-1.180253547776717,-0.194988736923602,-0.645052874385757,-1.400566775105519]]).T
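A possible explanation, offered as a guess rather than a confirmed diagnosis: `__init__` centers only a local copy of `X`, while `project` multiplies the uncentered `X` by the eigenvectors. sklearn's PCA subtracts the fitted mean before projecting. The sketch below is a minimal standalone version that centers the data before the projection; `pca_project` is a hypothetical helper, not part of the class above, and the signs of individual components may still differ from another implementation because eigenvectors are only defined up to sign.

```python
import numpy as np

def pca_project(X, m):
    """Hypothetical helper: project the rows of X onto the top-m principal axes."""
    mean = X.mean(axis=0)              # per-feature mean
    Xc = X - mean                      # centered data, used for BOTH fit and projection
    C = np.cov(Xc, rowvar=False)       # covariance with data points in rows
    d, u = np.linalg.eigh(C)           # eigenvalues ascending, eigenvectors in columns
    W = u[:, ::-1][:, :m]              # top-m eigenvectors as columns
    return Xc @ W, d[::-1][:m]

# Input data as defined in the question.
X = np.array([
    [-2.133268233289599, 0.903819474847349, 2.217823388231679, -0.444779660856219,
     -0.661480010318842, -0.163814281248453, -0.608167714051449, 0.949391996219125],
    [-1.273486742804804, -1.270450725314960, -2.873297536940942, 1.819616794091556,
     -2.617784834189455, 1.706200163080549, 0.196983250752276, 0.501491995499840],
    [-0.935406638147949, 0.298594472836292, 1.520579082270122, -1.390457671168661,
     -1.180253547776717, -0.194988736923602, -0.645052874385757, -1.400566775105519]]).T

Z, top_eigvals = pca_project(X, 2)
print(Z)
print(Z.mean(axis=0))  # close to zero for a centered projection
```

In the class itself, the equivalent change would be to store the mean in `__init__` (for example as `self.mean`, a name assumed here) and subtract it from `X` inside `project` before the dot product.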