2012-09-16 53 views
5

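Programmatically getting XMeans clusterer output in Weka

When using KMeans in Weka, you can call getAssignments() on the resulting model to get the cluster assignment for each given instance. Here is a (truncated) Jython example: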

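>>> import weka.clusterers.SimpleKMeans as SimpleKMeans
>>> kmeans = SimpleKMeans()
>>> kmeans.setPreserveInstancesOrder(True)  # getAssignments() needs the instance order preserved
>>> kmeans.buildClusterer(data)  # data is a weka.core.Instances object loaded earlier
>>> assignments = kmeans.getAssignments()
>>> assignments
array('i',[14, 16, 0, 0, 0, 0, 16,...])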

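The position of each cluster number in the array corresponds to an instance. So instance 0 is in cluster 14, instance 1 is in cluster 16, and so on.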
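To make the index-to-instance mapping concrete, here is a minimal sketch in plain Python that inverts such an array into a "cluster number to instance indices" table. The function name `group_by_cluster` is made up for illustration, and the sample list simply copies the seven values visible in the truncated output above.

```python
from collections import defaultdict

def group_by_cluster(assignments):
    """Map each cluster number to the list of instance indices assigned to it."""
    groups = defaultdict(list)
    for instance_index, cluster_num in enumerate(assignments):
        groups[cluster_num].append(instance_index)
    return dict(groups)

# Only the seven values shown in the truncated output; the rest is unknown.
sample = [14, 16, 0, 0, 0, 0, 16]
groups = group_by_cluster(sample)
```

With this sample, instances 2 through 5 end up grouped under cluster 0, and instances 1 and 6 under cluster 16.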

My question is: is there something similar for XMeans? I have gone through the entire API here and did not see anything like it.

Answer

7

Here is my Jython implementation of the reply I received from the Weka mailing list, which is quoted after the code:

>>> import java.io.BufferedReader as BufferedReader
>>> import java.io.FileReader as FileReader
>>> import weka.core.Instances as Instances
>>> import weka.clusterers.XMeans as XMeans
>>> reader = BufferedReader(FileReader("some arff file"))
>>> data = Instances(reader)  # load the ARFF data once
>>> reader.close()
>>> xmeans = XMeans()
>>> xmeans.setMaxNumClusters(100)
>>> xmeans.setMinNumClusters(2)
>>> xmeans.buildClusterer(data)  # here's our model
>>> instances = data.enumerateInstances()  # enumeration over the instances, in order
>>> for index, instance in enumerate(instances):
...     cluster_num = xmeans.clusterInstance(instance)  # pass each instance through the model
...     print "instance #", index, "is in cluster", cluster_num  # pretty print results
...

instance # 0 is in cluster 1 
instance # 1 is in cluster 1 
instance # 2 is in cluster 0 
instance # 3 is in cluster 0 

I am leaving this all up as a reference. The reply from the mailing list was:

"Not as such. But all clusterers have a clusterInstance() method. You can 
pass each training instance through the trained clustering model to 
obtain the cluster index for each." 

The Jython code above implements this suggestion. The same approach can be used to get the cluster assignments from any of Weka's clusterers.
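Since the approach relies only on clusterInstance(), it can be wrapped in a small helper. The following is a rough sketch rather than anything from Weka itself: `cluster_assignments` and `ThresholdClusterer` are hypothetical names, and the stub class only stands in for a trained clusterer so the snippet runs without Weka installed.

```python
def cluster_assignments(clusterer, instances):
    """Return a list whose i-th entry is the cluster index of the i-th instance.

    `clusterer` is assumed to be any object exposing clusterInstance(instance);
    `instances` is any iterable of instances.
    """
    return [clusterer.clusterInstance(instance) for instance in instances]


class ThresholdClusterer(object):
    """Hypothetical stand-in for a trained clusterer (not part of Weka)."""

    def clusterInstance(self, instance):
        # Toy rule: values below 10 go to cluster 0, everything else to cluster 1.
        return 0 if instance < 10 else 1


assignments = cluster_assignments(ThresholdClusterer(), [3, 42, 7, 15])
```

In a Jython session, passing the trained xmeans model and data.enumerateInstances() to the helper should yield a list laid out like the getAssignments() array, with one cluster index per instance in order.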
