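Updating and appending in a Python loop

I am new to Python and I am trying to develop code that performs K-Means clustering using a predefined package called Pycluster. At first I clustered with a fixed number of clusters (n = 10), and the code worked fine. I then tried to extend the code so that, instead of producing only 10 clusters, a loop increases the requested number of clusters from 2 to 10 (or more). As I said, this is where the problems started, since I am completely new to Python. The code I have developed is shown below. I believe the errors start in code lines 33 to 49. I would really appreciate any help getting the code to run.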
# -*- coding: utf-8 -*-
"""
Created on Mon Oct 21 13:53:40 2013
@author: Engin
"""
from Pycluster import *
import numpy as np
#Open the text file containing the stored smart meter data
d=np.loadtxt("120-RES-195-Normalized.txt", delimiter="\t", skiprows=1, usecols=range(1,49))
handle=open("120-RES-195-Normalized.txt")
record = read(handle) #Store the smart meter data in an array called record.
cluster_results = np.ones((120, 11))
cluster_centroids=np.array([])
within_cluster_sum_of_squares=np.ones((1,11))
between_cluster_sum_of_squares=np.ones((1,11))
distance=[]
for n in range (1,11):
    cluster_results[:,n-1], within_cluster_sum_of_squares[:,n-1], optimal_solution_repetition = record.kcluster(nclusters=n, npass=10, method='a', dist='e') #Performs the K-Means clustering using the defined parameters
    centroids, cmask = record.clustercentroids(cluster_results[:,n-1], method='a', transpose=0) #Calculates the cluster centroids
    cluster_centroids=np.append(cluster_centroids,centroids)
    #The following routine stores the cluster numbers and the indices of the elements belonging to each
    #cluster so that the Between Clusters Sum of Squares would be easily calculated. The results will also
    #be easily visualised.
    from collections import defaultdict
    cluster_numbers_members = defaultdict(list)
    for i,item in enumerate(cluster_results[:,n-1]):
        cluster_numbers_members[item].append(i)
    cluster_numbers_members = {k:v for k,v in cluster_numbers_members.items() if len(v)>=1}
    cluster_members=cluster_numbers_members.values()
    cluster_numbers=cluster_numbers_members.keys()
    distance[:,n-1]=0
    between_cluster_sum_of_squares[:,n-1]=0
    for i in range(0,n):
        for k in range(0,n):
            distance[:,n-1] = record.clusterdistance(index1=cluster_members[i], index2=cluster_members[k], method='a', dist='e', transpose=0)
            between_cluster_sum_of_squares[:,n-1]=between_cluster_sum_of_squares[:,n-1]+distance[:,n-1]
WCBCR = within_cluster_sum_of_squares/between_cluster_sum_of_squares
print cluster_results[:,n-1]
print within_cluster_sum_of_squares[:,n-1]
print cluster_centroids
#Arranging cluster centroids in (1X48) vector form
cluster_tuple=zip(*[iter(cluster_centroids)]*48)
cluster_array=numpy.array(list(cluster_tuple))
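"This is where the problems started, since, as I said, I am completely new to Python." Please provide more details. What kind of problems? Do you get an error message? – Kevin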
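Hi @Kevin, I updated the code because I had some mistakes in the variable names. In an earlier version of the code I used some other variable names, but I had to rename them to make the code clearer and more consistent. When I try to run the current (updated) code, I keep getting the following error message: distance[:,n-1]=0 TypeError: list indices must be integers, not tuple. Thanks in advance for your help. – user2470127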
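
The TypeError quoted in the comment above most likely comes from distance being created as a plain Python list (distance=[]), which cannot be indexed with [:,n-1]. The following is only a minimal sketch of one possible fix, assuming distance should have the same (1, 11) shape that the question gives within_cluster_sum_of_squares. The names max_clusters, distance_list, error_message and pair_distance are hypothetical, and the placeholder value stands in for the real record.clusterdistance(...) call, which is not reproduced here.

```python
import numpy as np

max_clusters = 10  # assumed upper bound, matching range(1, 11) in the question

# A plain list does not accept a (slice, int) index, so this assignment fails
# with the same kind of TypeError as the one reported in the question.
distance_list = []
try:
    distance_list[:, 0] = 0
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Preallocated NumPy arrays accept the [:, n - 1] indexing used in the loop.
distance = np.zeros((1, max_clusters + 1))
between_cluster_sum_of_squares = np.zeros((1, max_clusters + 1))

for n in range(1, max_clusters + 1):
    distance[:, n - 1] = 0
    between_cluster_sum_of_squares[:, n - 1] = 0
    for i in range(n):
        for k in range(n):
            # Placeholder distance (assumption); the real script would use the
            # value returned by record.clusterdistance(...) for clusters i and k.
            pair_distance = float(abs(i - k))
            distance[:, n - 1] = pair_distance
            between_cluster_sum_of_squares[:, n - 1] += pair_distance
```

With arrays instead of a list, each column n-1 can be reset and accumulated independently for every cluster count. Separately, the last line of the posted script calls numpy.array although the module was imported as np, so that line would raise a NameError once the loop runs through.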