2013-06-20

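Unsupervised SOM visualization

I need to plot the results of an unsupervised rectangular SOM model. Additional requirements: 1) each node should be drawn as a pie chart of the observed classes of the samples mapped to it; 2) the size of each chart should reflect the number of samples in the node. The default plot.kohonen is not suitable for this.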

Answer


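Here is a possible solution. The first function, som.prep.df, is called by the second, som.draw, which needs only two arguments: the SOM model and the observed classes of the training set.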

som.prep.df <- function(som.model, obs.classes, scaled) {
    require(reshape2)
    # use the classes passed in as an argument, not the global wine.classes
    lev <- factor(obs.classes)
    df <- data.frame(cbind(unit=som.model$unit.classif, class=as.integer(lev)))
    # count samples per unit and class
    df2 <- data.frame(table(df))
    df2 <- dcast(df2, unit ~ class, value.var="Freq")
    # unit is a factor here; go through character to keep the real unit numbers
    df2$unit <- as.integer(as.character(df2$unit))
    # total number of samples in each node
    df2$sum <- rowSums(df2[,-1])
    # cumulative class fractions in each node (columns X0, X1, ...)
    tmp <- data.frame(cbind(X0=rep(0, nrow(df2)),
                            t(apply(df2[,-1], 1, function(x) {
                                cumsum(x[1:(length(x)-1)]) / x[length(x)]
                            }))))
    df2 <- cbind(df2, tmp)
    # class columns are named by their integer code ("1", "2", ..., "10", ...)
    df2 <- melt(df2, id.vars=which(!grepl("^\\d+$", colnames(df2))))
    df2 <- df2[,-ncol(df2)]
    # borders of each class slice in each node
    tmp <- t(apply(df2, 1, function(x) {
        c(x[paste0("X", as.character(as.integer(x["variable"])-1))],
          x[paste0("X", as.character(x["variable"]))])
    }))
    tmp <- data.frame(tmp, stringsAsFactors=FALSE)
    tmp <- sapply(tmp, as.numeric)
    colnames(tmp) <- c("ymin", "ymax")
    df2 <- cbind(df2, tmp)
    # scale the size of the pie charts
    if (is.logical(scaled)) {
        if (scaled) {
            df2$xmax <- log2(df2$sum)
        } else {
            df2$xmax <- df2$sum
        }
    }
    df2 <- df2[,c("unit", "variable", "ymin", "ymax", "xmax")]
    colnames(df2) <- c("unit", "class", "ymin", "ymax", "xmax")
    # replace class codes with the original level names
    df2$class <- levels(lev)[df2$class]
    return(df2)
}

som.draw <- function(som.model, obs.classes, scaled=FALSE) {
    # scaled: if TRUE, scale the size of each node logarithmically
    require(ggplot2)
    require(grid)
    g <- som.model$grid
    df <- som.prep.df(som.model, obs.classes, scaled)
    # look up the node coordinates by unit number instead of relying on recycling
    df <- data.frame(x=g$pts[df$unit, "x"], y=g$pts[df$unit, "y"], df[,-1])
    df$class <- factor(df$class)
    g <- ggplot(df, aes(fill=class, ymax=ymax, ymin=ymin, xmax=xmax, xmin=0)) +
        geom_rect() +
        coord_polar(theta="y") +
        facet_wrap(x~y, ncol=g$xdim, nrow=g$ydim) +
        # panel.spacing replaces the deprecated panel.margin (ggplot2 >= 2.2.0)
        theme(axis.ticks = element_blank(),
              axis.text.y = element_blank(),
              axis.text.x = element_blank(),
              panel.spacing = unit(0, "cm"),
              strip.background = element_blank(),
              strip.text = element_blank(),
              plot.margin = unit(c(0,0,0,0), "cm"),
              panel.background = element_blank(),
              panel.grid = element_blank())
    return(g)
}
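
To make the slice-border step in som.prep.df easier to follow, here is a minimal base-R sketch on made-up data. The `unit` and `class` vectors are hypothetical and not taken from the wines data, and this is a simplified illustration rather than a drop-in replacement for the function above. For each node it counts the samples per class, turns the counts into cumulative fractions, and uses consecutive fractions as the `ymin`/`ymax` of each pie slice.

```r
# Hypothetical toy data: node assignment and class label of 7 samples
unit  <- c(1, 1, 1, 1, 2, 2, 3)
class <- factor(c("a", "a", "b", "c", "b", "b", "c"))

# Count samples per node (rows) and class (columns)
counts <- table(unit, class)

# Cumulative fraction of each class within its node (one row per node)
cum_frac <- t(apply(counts, 1, cumsum)) / rowSums(counts)

# Slice borders: each slice starts where the previous one ends
ymax <- cum_frac
ymin <- cbind(0, cum_frac[, -ncol(cum_frac), drop = FALSE])
colnames(ymin) <- colnames(ymax)

# Number of samples per node, which drives the size of each pie
node_size <- rowSums(counts)
```

The last slice of every node ends at 1, so each pie is always a full circle; only its radius (derived from `node_size`) differs between nodes.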

Usage example:

require(kohonen)
# loads wines and wine.classes; newer kohonen versions may name the labels vintages
data(wines)
som.wines <- som(scale(wines), grid = somgrid(5, 5, "rectangular"))

# Non-scaled map 
som.draw(som.wines, wine.classes) 


# Scaled map 
som.draw(som.wines, wine.classes, TRUE) 


This function can also be used to visualize supervised models. However, it only works for rectangular maps. I hope this helps someone.

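There are several possible improvements:

  1. Choose a better scaling function than the logarithm, because nodes with a single sample currently become invisible after scaling (log2(1) is 0).
  2. Add a legend for the whole plot that reflects the node sizes.
  3. Alternatively, add the number of samples in the node to each chart.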
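
As a rough illustration of the first point, the sketch below compares the current log2 scaling with two alternatives. The function names are hypothetical and none of this has been checked on real maps. A shifted logarithm keeps single-sample nodes visible, while a square root makes the area of each pie roughly proportional to the sample count, assuming the radial axis is linear and starts at zero.

```r
# Candidate radius functions for a node holding n samples (n >= 1)
radius_log2  <- function(n) log2(n)      # current behaviour: 0 for n = 1
radius_log1p <- function(n) log2(n + 1)  # shifted log: 1 for n = 1
radius_sqrt  <- function(n) sqrt(n)      # area of the pie grows linearly with n

# Compare the three options for a few node sizes
n <- c(1, 2, 8, 32)
scaling <- data.frame(n = n,
                      log2 = radius_log2(n),
                      log2_shift = radius_log1p(n),
                      sqrt = radius_sqrt(n))
print(scaling)
```

Either alternative could be substituted for `log2(df2$sum)` in som.prep.df; which one reads better probably depends on how unevenly the samples are spread over the nodes.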

P.S. The code is not very elegant, so any suggestions and improvements are welcome.