2017-07-24 65 views

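Converting a cluster summary object to a data frame

I am trying to extract the validation measures from a cluster validation object in R that was created with clValid.

I create the object and print the full summary with the following code: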

library(clValid) 

x <- clValid(iris[, -5], nClust=2:10, 
     clMethods=c('hierarchical'), validation='internal') 
summary(x) 

The output is:

Clustering Methods: 
hierarchical 

Cluster sizes: 
2 3 4 5 6 7 8 9 10 

Validation Measures: 
                                2       3       4       5       6       7       8       9      10

hierarchical Connectivity  0.0000  4.4770  8.9929 15.4893 18.4183 24.8464 29.8425 36.8567 39.5607
             Dunn          0.3389  0.1378  0.1540  0.1540  0.1668  0.1624  0.1624  0.1915  0.1915
             Silhouette    0.6867  0.5542  0.4720  0.4307  0.3420  0.3707  0.3659  0.3167  0.3083

Optimal Scores: 

             Score  Method       Clusters
Connectivity 0.0000 hierarchical 2
Dunn         0.3389 hierarchical 2
Silhouette   0.6867 hierarchical 2

Desired output

I would like to get the Validation Measures as a data frame like this:

                                2       3       4       5       6       7       8       9      10

hierarchical Connectivity  0.0000  4.4770  8.9929 15.4893 18.4183 24.8464 29.8425 36.8567 39.5607
             Dunn          0.3389  0.1378  0.1540  0.1540  0.1668  0.1624  0.1624  0.1915  0.1915
             Silhouette    0.6867  0.5542  0.4720  0.4307  0.3420  0.3707  0.3659  0.3167  0.3083

Attempts

When I use:

names(summary(x)) 
attributes(summary(x)) 

both calls return

NULL 

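I can get the optimal scores with optimalScores(x), but the analogous call validationMeasures(x) does not work.

Question

Is there a way to extract the Validation Measures from this summary object as a data.frame?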

Answer


First, you should always try str() on the object:

str(x) 
Formal class 'clValid' [package "clValid"] with 14 slots 
    ..@ clusterObjs:List of 1
    .. ..$ hierarchical:List of 7
    .. .. ..$ merge      : int [1:149, 1:2] -102 -8 -1 -10 -129 -11 -5 -20 -30 -58 ...
    .. .. ..$ height     : num [1:149] 0 0.1 0.1 0.1 0.1 ...
    .. .. ..$ order      : int [1:150] 42 15 16 33 34 37 21 32 44 24 ...
    .. .. ..$ labels     : NULL
    .. .. ..$ method     : chr "average"
    .. .. ..$ call       : language hclust(d = Dist, method = method)
    .. .. ..$ dist.method: chr "euclidean"
    .. .. ..- attr(*, "class")= chr "hclust"
    ..@ measures   : num [1:3, 1:9, 1] 0 0.339 0.687 4.477 0.138 ...
    .. ..- attr(*, "dimnames")=List of 3
    .. .. ..$ : chr [1:3] "Connectivity" "Dunn" "Silhouette"
    .. .. ..$ : chr [1:9] "2" "3" "4" "5" ...
    .. .. ..$ : chr "hierarchical"
    ..@ measNames  : chr [1:3] "Connectivity" "Dunn" "Silhouette"
    ..@ clMethods  : chr "hierarchical"
    ..@ labels     : chr [1:150] "1" "2" "3" "4" ...
    ..@ nClust     : num [1:9] 2 3 4 5 6 7 8 9 10
    ..@ validation : chr "internal"
    ..@ metric     : chr "euclidean"
    ..@ method     : chr "average"
    ..@ neighbSize : num 10
    ..@ annotation : NULL
    ..@ GOcategory : chr "all"
    ..@ goTermFreq : num 0.05
    ..@ call       : language clValid(obj = iris[, -5], nClust = 2:10, clMethods = c("hierarchical"), validation = "internal")

So we can see that this package uses and returns S4 objects, and that one of the slots, measures, appears to be what you want.
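If S4 is unfamiliar, here is a minimal sketch of how slots are listed and read in general. The class ToyResult and its contents are hypothetical and exist only for illustration; they are not part of clValid.

```r
library(methods)  # attached by default in interactive R; explicit for Rscript

# Hypothetical toy class, used only to illustrate S4 slot access
setClass("ToyResult", slots = c(measures = "matrix", label = "character"))

obj <- new("ToyResult",
           measures = matrix(1:4, nrow = 2,
                             dimnames = list(c("a", "b"), c("2", "3"))),
           label = "demo")

isS4(obj)              # checks whether the object is an S4 object
slotNames(obj)         # names of all slots
obj@measures           # read a slot with @
slot(obj, "measures")  # same slot, with the name given as a string
```

For the clValid object here, the measures slot is read the same way: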

x@measures[,,"hierarchical"]
                     2         3         4          5          6          7
Connectivity 0.0000000 4.4769841 8.9928571 15.4892857 18.4182540 24.8464286
Dunn         0.3389087 0.1378257 0.1540416  0.1540416  0.1668323  0.1624158
Silhouette   0.6867351 0.5541609 0.4719936  0.4306700  0.3419904  0.3707424
                      8          9         10
Connectivity 29.8424603 36.8567460 39.5607143
Dunn          0.1624158  0.1914854  0.1914854
Silhouette    0.3658753  0.3166807  0.3082851
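That slice is still a matrix rather than a data.frame, so one more step is needed to get what the question asks for. The sketch below is one possible way to do it, not something the package prescribes. So that it runs without clValid installed, it builds a stand-in array m with the same layout as the measures slot (measures x cluster counts x methods), filled with the first three columns of the summary shown in the question. The names m, wide and long are my own.

```r
# Stand-in for the measures slot: a measures x cluster-counts x methods array.
# Values are the first three columns of the summary in the question.
m <- array(
  c(0.0000, 0.3389, 0.6867,   # 2 clusters
    4.4770, 0.1378, 0.5542,   # 3 clusters
    8.9929, 0.1540, 0.4720),  # 4 clusters
  dim = c(3, 3, 1),
  dimnames = list(c("Connectivity", "Dunn", "Silhouette"),
                  c("2", "3", "4"),
                  "hierarchical")
)

# Wide layout: one row per measure, one column per cluster count
wide <- as.data.frame(m[, , "hierarchical"])

# Long layout: one row per (measure, cluster count, method) combination
long <- as.data.frame(as.table(m), responseName = "value")
names(long)[1:3] <- c("measure", "nClust", "method")

print(wide)
print(long)
```

With the real object, the array from the measures slot would take the place of m. The long layout may be the more convenient one if several clustering methods are compared, since the method then becomes an ordinary column. Note that nClust comes out as a factor there and would need converting if numeric cluster counts are required.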

Thanks. I had no idea about str(). It looks really useful. I was really stuck. Thanks again.