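How do I get `skbio` PCoA (Principal Coordinates Analysis) results?

I'm looking at the attributes of skbio's PCoA method (listed below). I'm new to this API, and I want to get the eigenvectors and the original points projected onto the new axes, similar to what `sklearn.decomposition.PCA` provides, so that I can create some PC_1 vs PC_2 style plots. I figured out how to get `eigvals` and `proportion_explained`, but `features` comes back as `None`. Is that because it is still in beta?

If there are any tutorials that use it, pointers would be greatly appreciated. I'm a huge fan of scikit-learn and would like to start using more of scikit's products.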
| Attributes
| ----------
| short_method_name : str
|     Abbreviated ordination method name.
| long_method_name : str
|     Ordination method name.
| eigvals : pd.Series
|     The resulting eigenvalues. The index corresponds to the ordination
|     axis labels
| samples : pd.DataFrame
|     The position of the samples in the ordination space, row-indexed by the
|     sample id.
| features : pd.DataFrame
|     The position of the features in the ordination space, row-indexed by
|     the feature id.
| biplot_scores : pd.DataFrame
|     Correlation coefficients of the samples with respect to the features.
| sample_constraints : pd.DataFrame
|     Site constraints (linear combinations of constraining variables):
|     coordinates of the sites in the space of the explanatory variables X.
|     These are the fitted site scores
| proportion_explained : pd.Series
|     Proportion explained by each of the dimensions in the ordination space.
|     The index corresponds to the ordination axis labels
Here is my code to generate the principal coordinates analysis (PCoA) object.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
from sklearn import decomposition
import seaborn as sns; sns.set_style("whitegrid", {'axes.grid' : False})
import skbio
from scipy.spatial import distance
%matplotlib inline
np.random.seed(0)

# Iris dataset
DF_data = pd.DataFrame(load_iris().data,
                       index = ["iris_%d" % i for i in range(load_iris().data.shape[0])],
                       columns = load_iris().feature_names)
n,m = DF_data.shape
# print(n,m)
# 150 4
Se_targets = pd.Series(load_iris().target,
                       index = ["iris_%d" % i for i in range(load_iris().data.shape[0])],
                       name = "Species")

# Scaling mean = 0, var = 1
DF_standard = pd.DataFrame(StandardScaler().fit_transform(DF_data),
                           index = DF_data.index,
                           columns = DF_data.columns)

# Distance Matrix
Ar_dist = distance.squareform(distance.pdist(DF_standard.T, metric="braycurtis")) # (m x m) distance measure
DM_dist = skbio.stats.distance.DistanceMatrix(Ar_dist, ids=DF_standard.columns)
PCoA = skbio.stats.ordination.pcoa(DM_dist)
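I believe `.samples` returned nothing. I can try again, and I'll make sure my `skbio` is up to date. I've been reading about PCoA, and a lot of the resources are rather cryptic. Compared with PCA, is it the same procedure, except that the eigendecomposition is applied to a distance matrix instead of the covariance matrix?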
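`.samples` is a required attribute of the `OrdinationResults` that `pcoa` produces. If you are still getting `None`, could you post an issue on the [scikit-bio issue tracker](https://github.com/biocore/scikit-bio/issues)? My understanding is that PCoA is applied to a distance matrix, which allows non-Euclidean distance metrics, whereas PCA is applied to a feature table and uses Euclidean distance. Running PCoA on a Euclidean distance matrix is therefore equivalent to PCA. [Here's](http://ordination.okstate.edu/overview.htm#Principal_coordinates_analysis) a useful resource on ordination methods. – jairideout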
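To make the relationship described in the comments above concrete, here is a minimal NumPy sketch of classical PCoA. It squares the distances, double-centers them (Gower centering), and eigendecomposes the result. It then checks that, for Euclidean distances, the coordinates match PCA scores up to the sign of each axis. The function name `pcoa_sketch` and the random data are assumptions for illustration only. This is not scikit-bio's implementation, and clipping negative eigenvalues to zero is a simplification.

```python
import numpy as np


def pcoa_sketch(dist):
    """Classical PCoA on a square distance matrix (illustrative sketch only)."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    # Gower double-centering of -0.5 * D**2
    centering = np.eye(n) - np.ones((n, n)) / n
    gower = -0.5 * centering @ (dist ** 2) @ centering
    # Symmetric eigendecomposition, sorted by decreasing eigenvalue
    eigvals, eigvecs = np.linalg.eigh(gower)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Non-Euclidean distances can yield negative eigenvalues;
    # clipping them to zero is a simplification made for this sketch
    clipped = np.clip(eigvals, 0.0, None)
    coords = eigvecs * np.sqrt(clipped)    # sample coordinates, one column per axis
    proportion = clipped / clipped.sum()   # proportion explained per axis
    return eigvals, coords, proportion


# Hypothetical data: 20 samples, 4 features
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
Xc = X - X.mean(axis=0)

# Euclidean distances between samples (rows)
diff = Xc[:, None, :] - Xc[None, :, :]
D = np.sqrt((diff ** 2).sum(axis=-1))

eigvals, coords, proportion = pcoa_sketch(D)

# PCA scores from the SVD of the centered data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
pca_scores = U * S

# Axes agree up to sign, so compare absolute values
same_up_to_sign = np.allclose(np.abs(coords[:, :4]), np.abs(pca_scores), atol=1e-6)
```

With a non-Euclidean metric such as Bray-Curtis the same steps apply, but the coordinates no longer correspond to PCA scores.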
`DF = skbio.OrdinationResults(long_method_name="TESTING", short_method_name="test", eigvals=PCoA.eigvals, samples=DF_data)` followed by `DF.samples` gives me back my original, untransformed data. Am I doing something wrong?
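One possible reading of the result above: constructing `OrdinationResults` by hand only stores whatever is passed as `samples`, so the original table comes back unchanged. The projected points would instead come from the `samples` attribute of the object returned by `pcoa`. Note also that the question's code computes distances on `DF_standard.T`, that is, between the four feature columns rather than between the 150 rows. The sketch below assumes that sample-by-sample distances are what is wanted. The helper name `sample_coordinates` and the default Euclidean metric are assumptions for illustration, and it needs `scipy` and `scikit-bio` installed.

```python
def sample_coordinates(table, metric="euclidean"):
    """Run PCoA on row-by-row distances of a pandas DataFrame (hypothetical helper).

    Returns the coordinates of each row in ordination space and the
    proportion explained by each axis.
    """
    # Imported lazily so that defining the helper does not require the libraries
    from scipy.spatial.distance import pdist, squareform
    from skbio.stats.distance import DistanceMatrix
    from skbio.stats.ordination import pcoa

    # Distances between rows (samples), not between columns (features)
    dist = squareform(pdist(table.values, metric=metric))
    dm = DistanceMatrix(dist, ids=[str(i) for i in table.index])

    result = pcoa(dm)
    # result.samples is a DataFrame indexed by sample id with columns PC1, PC2, ...
    return result.samples, result.proportion_explained
```

Calling `coords, explained = sample_coordinates(DF_standard)` and then `coords.plot.scatter(x="PC1", y="PC2")` should give the PC_1 vs PC_2 style plot the question asks about.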