2016-09-29 93 views
2

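sklearn fit_predict does not accept a 2-dimensional numpy array

I want to run a cluster analysis using three different clustering algorithms. My data is loaded from standard input as follows: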

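import sys

import numpy
import sklearn.cluster as cluster

# Read one "x1 x2" pair per line from standard input.
X = []
for line in sys.stdin:
    x1, x2 = line.strip().split()
    X.append([float(x1), float(x2)])
X = numpy.array(X)  # shape: (n_samples, 2)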

I then store the algorithm names and their parameters in a list, like this:

clustering_configs = [ 
    ### K-Means 
    ['KMeans', {'n_clusters' : 5}], 
    ### Ward 
    ['AgglomerativeClustering', { 
       'n_clusters' : 5, 
       'linkage' : 'ward' 
       }], 
    ### DBSCAN 
    ['DBSCAN', {'eps' : 0.15}] 
] 

and then I try to instantiate and run each of them in a for loop:

for alg_name, alg_params in clustering_configs: 

    class_ = getattr(cluster, alg_name) 
    instance_ = class_(alg_params) 

    instance_.fit_predict(X) 

Everything works except the instance_.fit_predict(X) call, which raises this error:

Traceback (most recent call last): 
    File "meta_cluster.py", line 47, in <module> 
    instance_.fit_predict(X) 
    File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.17.1-py2.7-linux-x86_64.egg/sklearn/cluster/k_means_.py", line 830, in fit_predict 
    return self.fit(X).labels_ 
    File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.17.1-py2.7-linux-x86_64.egg/sklearn/cluster/k_means_.py", line 812, in fit 
    X = self._check_fit_data(X) 
    File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.17.1-py2.7-linux-x86_64.egg/sklearn/cluster/k_means_.py", line 789, in _check_fit_data 
    X.shape[0], self.n_clusters)) 
TypeError: %d format: a number is required, not dict 

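Does anyone have a clue where I might be going wrong? I read the sklearn documentation, which says that fit_predict only needs an array-like or sparse matrix, shape=(n_samples, n_features), and I believe that is what I have.

Any suggestions? Thanks!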

Answers

2
class sklearn.cluster.KMeans(n_clusters=8, init='k-means++', n_init=10, max_iter=300, tol=0.0001, precompute_distances='auto', verbose=0, random_state=None, copy_x=True, n_jobs=1, algorithm='auto')

Given that signature, the way you would call the KMeans class is

KMeans(n_clusters=5) 

With your current code, however, you are calling

KMeans({'n_clusters': 5}) 

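which passes alg_params as a single dict instead of as keyword arguments to the class. The dict is bound to the first positional parameter, n_clusters, which is why the traceback complains that "a number is required, not dict". The same applies to the other algorithms.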
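To make the difference concrete, here is a minimal sketch that does not need scikit-learn at all. ToyKMeans is a hypothetical stand-in I made up for illustration; it only mimics a constructor whose first parameter is n_clusters, and is not the real estimator.

```python
class ToyKMeans:
    """Hypothetical stand-in for an estimator; NOT the scikit-learn class."""

    def __init__(self, n_clusters=8, max_iter=300):
        self.n_clusters = n_clusters
        self.max_iter = max_iter


params = {'n_clusters': 5}

# Passing the dict positionally binds the whole dict to n_clusters.
wrong = ToyKMeans(params)

# Unpacking with ** turns each key/value pair into a keyword argument.
right = ToyKMeans(**params)

print(type(wrong.n_clusters).__name__)  # dict
print(right.n_clusters)                 # 5
```

In the first call n_clusters ends up holding the dict itself, which is the situation the real KMeans trips over later when it tries to format n_clusters as a number.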

+0

Is there a simple way to convert these values from the dictionary into the required format? – wKavey

+2

@wKavey: 'KMeans(**{'n_clusters': 5})' –

+0

So in my example, 'instance_ = class_(**alg_params)'? – wKavey
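That one-line change should be enough. Below is a sketch of the question's loop with the dict unpacked, assuming scikit-learn and numpy are installed. The random 50-point array is my own placeholder for the data read from standard input, and labels_by_algorithm is a name introduced here only to keep the results.

```python
import numpy
import sklearn.cluster as cluster

# Placeholder data (an assumption): 50 random 2-D points instead of stdin input.
rng = numpy.random.RandomState(0)
X = rng.rand(50, 2)

clustering_configs = [
    ['KMeans', {'n_clusters': 5}],
    ['AgglomerativeClustering', {'n_clusters': 5, 'linkage': 'ward'}],
    ['DBSCAN', {'eps': 0.15}],
]

labels_by_algorithm = {}
for alg_name, alg_params in clustering_configs:
    class_ = getattr(cluster, alg_name)
    # ** unpacks the dict so each key becomes a keyword argument.
    instance_ = class_(**alg_params)
    labels_by_algorithm[alg_name] = instance_.fit_predict(X)

for alg_name, labels in labels_by_algorithm.items():
    print(alg_name, labels.shape)
```

Each fit_predict call returns one cluster label per row of X, so every entry should have shape (50,) here. DBSCAN marks points it considers noise with the label -1.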