2016-10-19 127 views

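I am trying to create a biplot for a linear discriminant analysis (LDA). I am using a modified version of the code obtained from here: https://stats.stackexchange.com/questions/82497/can-the-scaling-values-in-a-linear-discriminant-analysis-lda-be-used-to-plot-e (LDA contribution biplot)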

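However, I have 80 variables, which makes the biplot extremely hard to read. The problem is made worse because a few variables have very long arrows, while the remaining labels are squeezed together in the middle. What I am trying to achieve is a biplot in which all variable arrows have the same length and their relative contribution (scaling) is shown by a color gradient. So far I have managed to get the graded colors, but I cannot find a way to make the arrows the same length. As far as I understand, geom_text and geom_segment use the LD1 and LD2 values to determine both the length and the direction of the arrows. How can I override the length?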


CODE:

library(ggplot2) 
library(grid) 
library(MASS) 
data(iris) 


iris.lda <- lda(as.factor(Species)~., 
       data=iris) 

#Project data on linear discriminants 
iris.lda.values <- predict(iris.lda, iris[,-5]) 

#Extract the scaling (coefficient) of each predictor on each discriminant 
data.lda <- data.frame(varnames=rownames(coef(iris.lda)), coef(iris.lda)) 

#coef(iris.lda) is equivalent to iris.lda$scaling 

data.lda$length <- with(data.lda, sqrt(LD1^2+LD2^2)) 

#Plot the results 
p <- qplot(data=data.frame(iris.lda.values$x), 
      main="LDA", 
      x=LD1, 
      y=LD2, 
      colour=iris$Species)+stat_ellipse(geom="polygon", alpha=.3, aes(fill=iris$Species)) 
p <- p + geom_hline(aes(yintercept=0), size=.2) + geom_vline(aes(xintercept=0), size=.2) 
p <- p + theme(legend.position="right") 
p <- p + geom_text(data=data.lda, 
        aes(x=LD1, y=LD2, 
         label=varnames, 
         shape=NULL, linetype=NULL, 
         alpha=length), 
        position="identity", #position is a layer argument, not an aesthetic 
        size = 4, vjust=.5, 
        hjust=0, color="red") 
p <- p + geom_segment(data=data.lda, 
         aes(x=0, y=0, 
          xend=LD1, yend=LD2, 
          shape=NULL, linetype=NULL, 
          alpha=length), 
         arrow=arrow(length=unit(0.1,"mm")), 
         color="red") 
p <- p + coord_flip() 

print(p) 

Answer


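How about this? We have to do a little trigonometry to make the lengths equal. Note that the lengths are equal in plot coordinates, so if you want the arrows to appear the same size on screen, you need to add coord_equal.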

(I cleaned up your plotting code, since a lot of it was quite messy.)

rad <- 3 # This sets the length of your lines. 
data.lda$length <- with(data.lda, sqrt(LD1^2+LD2^2)) 
data.lda$angle <- atan2(data.lda$LD1, data.lda$LD2) 
data.lda$x_start <- data.lda$y_start <- 0 
data.lda$x_end <- cos(data.lda$angle) * rad 
data.lda$y_end <- sin(data.lda$angle) * rad 

#Plot the results 
ggplot(cbind(iris, iris.lda.values$x), 
     aes(y = LD1, x = LD2, colour = Species)) + 
    stat_ellipse(aes(fill = Species), geom = "polygon", alpha = .3) + 
    geom_point() + 
    geom_hline(yintercept = 0, size = .2) + 
    geom_vline(xintercept = 0, size = .2) + 
    geom_text(aes(y = y_end, x = x_end, label = varnames, alpha = length), 
      data.lda, size = 4, vjust = .5, hjust = 0, colour = "red") + 
    geom_spoke(aes(x_start, y_start, angle = angle, alpha = length), data.lda, 
      color = "red", radius = rad, size = 1) + 
    ggtitle("LDA") + 
    theme(legend.position = "right") 
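
If the trigonometry feels opaque, here is a minimal sketch of an equivalent trig-free route, assuming the same axis layout as above (LD2 on the x axis, LD1 on the y axis): divide each (LD2, LD1) vector by its own length and multiply by rad. The names x_trig and y_trig are only illustrative, and the check at the end just confirms that both routes give the same endpoints.

```r
library(MASS)  # provides lda()

iris.lda <- lda(Species ~ ., data = iris)
data.lda <- data.frame(varnames = rownames(iris.lda$scaling), iris.lda$scaling)

rad <- 3  # common arrow length, in plot units
data.lda$length <- with(data.lda, sqrt(LD1^2 + LD2^2))

# Trigonometric version: angle of each variable with LD2 as x and LD1 as y
angle <- atan2(data.lda$LD1, data.lda$LD2)
x_trig <- cos(angle) * rad  # illustrative name
y_trig <- sin(angle) * rad  # illustrative name

# Trig-free version: rescale each (LD2, LD1) vector to unit length, then to rad
data.lda$x_end <- data.lda$LD2 / data.lda$length * rad
data.lda$y_end <- data.lda$LD1 / data.lda$length * rad

# Both routes give the same endpoints, and every arrow has length rad
stopifnot(
  isTRUE(all.equal(data.lda$x_end, x_trig)),
  isTRUE(all.equal(data.lda$y_end, y_trig)),
  isTRUE(all.equal(sqrt(data.lda$x_end^2 + data.lda$y_end^2),
                   rep(rad, nrow(data.lda))))
)
```

The x_end and y_end columns can also be passed to geom_segment as xend and yend if you prefer segments starting at the origin. Either way, adding coord_equal() keeps the arrows the same length on screen, not just in plot coordinates.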


+1

Truly amazing! I would never have come up with this. Thank you!