
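Efficient way to access data grouped by KMeans cluster from the cluster center

I am trying to draw a circle around each centroid, with the radius extending to the farthest point that belongs to that cluster. Right now, my circles are drawn with a radius that extends to the farthest point in the entire training dataset.

Here is my code: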

def KMeansModel(n): 
    pca = PCA(n_components=2) 
    reduced_train_data = pca.fit_transform(train_data) 
    KM = KMeans(n_clusters=n) 
    KM.fit(reduced_train_data) 
    plt.plot(reduced_train_data[:, 0], reduced_train_data[:, 1], 'k.', markersize=2) 
    centroids = KM.cluster_centers_ 
    # Plot the centroids as a red X 
    plt.scatter(centroids[:, 0], centroids[:, 1], 
       marker='x', color='r') 
    for i in centroids: 
        # The radius here is the distance to the farthest point in the whole dataset
        print(np.max(metrics.pairwise_distances(i, reduced_train_data))) 
        plt.gca().add_artist(plt.Circle(i, np.max(metrics.pairwise_distances(i, reduced_train_data)), fill=False)) 
    plt.show() 

out = [KMeansModel(n) for n in np.arange(1,16,1)] 

Answer


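When you call

metrics.pairwise_distances(i, reduced_train_data) 

you compute the distances to all training points, not only to the points of the relevant class (cluster). To find the positions of the training data points that belong to class ind, you can do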

np.where(KM.labels_==ind)[0] 
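
As a small illustration of that indexing, here is a minimal sketch on made-up data. The arrays `labels` and `points` are hypothetical stand-ins for `KM.labels_` and `reduced_train_data`; they are not taken from the question.

```python
import numpy as np

# Hypothetical cluster labels for six points, standing in for KM.labels_
labels = np.array([0, 2, 1, 0, 2, 0])
# Six made-up 2-D points, standing in for reduced_train_data
points = np.arange(12).reshape(6, 2)

ind = 0
# Row indices of the points assigned to cluster `ind`
class_inds = np.where(labels == ind)[0]
# Only the rows that belong to cluster `ind`
cluster_points = points[class_inds]
```

Indexing the data with `class_inds` keeps only the rows of one cluster, which is what the distance computation needs.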

So, inside the for loop

for i in centroids: 

you need to access only the training points that belong to the particular class. This will do the job:

from sklearn.decomposition import PCA 
from sklearn.cluster import KMeans 
from sklearn import metrics 
import matplotlib.pyplot as plt 
import numpy as np 

def KMeansModel(n): 
    pca = PCA(n_components=2) 
    reduced_train_data = pca.fit_transform(train_data) 
    KM = KMeans(n_clusters=n) 
    KM.fit(reduced_train_data) 
    plt.plot(reduced_train_data[:, 0], reduced_train_data[:, 1], 'k.', markersize=2) 
    centroids = KM.cluster_centers_ 
    # Plot the centroids as a red X 
    plt.scatter(centroids[:, 0], centroids[:, 1], 
       marker='x', color='r') 
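    for ind, i in enumerate(centroids): 
        # Indices of the training points assigned to cluster `ind`
        class_inds = np.where(KM.labels_ == ind)[0] 
        # pairwise_distances expects 2-D input, so reshape the centroid into a single row
        max_dist = np.max(metrics.pairwise_distances(i.reshape(1, -1), reduced_train_data[class_inds])) 
        print(max_dist) 
        plt.gca().add_artist(plt.Circle(i, max_dist, fill=False)) 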
    plt.show() 

out = [KMeansModel(n) for n in np.arange(1,16,1)] 

And this is one of the figures I get using this code:

[image]
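
Since the question asks for an efficient way to get at the per-cluster data, the loop over centroids could also be replaced by a vectorized computation. The following is only a sketch under assumptions: the helper name `cluster_radii` is hypothetical, the tiny arrays are made up for illustration, and the centroids are chosen by hand rather than fitted.

```python
import numpy as np

def cluster_radii(points, labels, centroids):
    """Return, per cluster, the distance from its centroid to its farthest member."""
    # Distance from every point to the centroid of its own cluster
    dists = np.linalg.norm(points - centroids[labels], axis=1)
    radii = np.zeros(len(centroids))
    # Per-cluster maximum: radii[k] becomes the largest dists value with labels == k
    np.maximum.at(radii, labels, dists)
    return radii

# Made-up data: two clusters with two points each (not from the question)
points = np.array([[0.0, 0.0], [3.0, 4.0], [10.0, 10.0], [10.0, 13.0]])
labels = np.array([0, 0, 1, 1])
centroids = np.array([[0.0, 0.0], [10.0, 10.0]])  # hand-picked, not fitted

radii = cluster_radii(points, labels, centroids)  # one radius per cluster
```

With the fitted model from the answer, the equivalent call would be `cluster_radii(reduced_train_data, KM.labels_, KM.cluster_centers_)`, and each entry of the result could be passed to `plt.Circle` as the radius for the matching centroid.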