2017-09-27

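Dodging columns in ggplot2

I want to create a figure that summarises my data. The data are the prevalence of drug use obtained from different practices in different countries. Each practice contributed a different amount of data, and I want to show all of this in my figure.

Here is a subset of the data to work with: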

# 2 drugs x 18 practices; practices a-j are in country c1, k-p in c2, q-r in c3
gr <- data.frame(
    drug     = rep(c("a", "b"), each = 18),
    practice = rep(letters[1:18], times = 2),
    country  = rep(rep(c("c1", "c2", "c3"), times = c(10, 6, 2)), times = 2),
    stringsAsFactors = FALSE)
gr$prevalence<-c(9.14,5.53,16.74,1.93,8.51,14.96,18.90,11.18,15.00,20.10,24.56,22.29,19.41,20.25,25.01,25.87,29.33,20.76,18.94,24.60,26.51,13.37,23.84,21.82,23.69,20.56,30.53,16.66,28.71,23.83,21.16,24.66,26.42,27.38,32.46,25.34) 
gr$prop<-c(0.027,0.023,0.002,0.500,0.011,0.185,0.097,0.067,0.066,0.023,0.433,0.117,0.053,0.199,0.098,0.100,0.594,0.406,0.027,0.023,0.002,0.500,0.011,0.185,0.097,0.067,0.066,0.023,0.433,0.117,0.053,0.199,0.098,0.100,0.594,0.406) 
gr$low.CI<-c(8.27,4.80,12.35,1.83,7.22,14.53,18.25,10.56,14.28,18.76,24.25,21.72,18.62,19.83,24.36,25.22,28.80,20.20,17.73,23.15,21.06,13.12,21.79,21.32,22.99,19.76,29.60,15.41,28.39,23.25,20.34,24.20,25.76,26.72,31.92,24.73) 
gr$high.CI<-c(10.10,6.37,22.31,2.04,10.00,15.40,19.56,11.83,15.74,21.52,24.87,22.86,20.23,20.68,25.67,26.53,29.86,21.34,20.21,26.10,32.79,13.63,26.02,22.33,24.41,21.39,31.48,17.98,29.04,24.43,22.01,25.12,27.09,28.05,33.01,25.95) 

The code I wrote is this:

p<-ggplot(data=gr, aes(x=factor(drug), y=as.numeric(gr$prevalence), ymax=max(high.CI),position="dodge",fill=practice,width=prop)) 
colour<-c(rep("gray79",10),rep("gray60",6),rep("gray39",2)) 
p + theme_bw()+ 
    geom_bar(stat="identity",position = position_dodge(0.9)) + 
    labs(x="Drug",y="Prevalence") + 
    geom_errorbar(ymax=gr$high.CI,ymin=gr$low.CI,position=position_dodge(0.9),width=0.25,size=0.25,colour="black",aes(x=factor(drug), y=as.numeric(gr$prevalence), fill=practice)) + 
    ggtitle("Drug usage by country and practice") + 
    scale_fill_manual(values = colour)+ guides(fill=F) 

In the figure I get, the bars are all drawn on top of each other, although I want them to be dodged.

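I also get the following warnings:

ymax not defined: adjusting position using y instead
Warning message:
position_dodge requires non-overlapping x intervals

Ideally I would like the bars to sit next to each other, each with its error bar in the middle of the bar, and all organised by country.

Should I also be concerned about the warnings (which I obviously don't fully understand)?

I hope this makes sense. I hope I am close enough, but I don't seem to be getting anywhere, so some help would be greatly appreciated.

Thank you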

Answers

2

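ggplot's geom_bar() accepts a width aesthetic, but by default it does not line bars of varying width up neatly when they are dodged. The workaround below is based on the solution here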

library(dplyr)
library(ggplot2)

# calculate the x-axis position (bar midpoint) for bars of varying width
gr <- gr %>%
    group_by(drug) %>%
    arrange(practice) %>%
    mutate(pos = 0.5 * (cumsum(prop) + cumsum(c(0, prop[-length(prop)])))) %>%
    ungroup()

# both drugs share the same practices and widths, so drug "a" supplies the axis labels
x.labels <- gr$practice[gr$drug == "a"]
x.pos <- gr$pos[gr$drug == "a"]

ggplot(gr,
       aes(x = pos, y = prevalence,
           fill = country, width = prop,
           ymin = low.CI, ymax = high.CI)) +
    geom_col(col = "black") +
    geom_errorbar(size = 0.25, colour = "black") +
    facet_wrap(~drug) +
    scale_fill_manual(values = c("c1" = "gray79",
                                 "c2" = "gray60",
                                 "c3" = "gray39"),
                      guide = "none") +
    scale_x_continuous(name = "Practice",
                       labels = x.labels,
                       breaks = x.pos) +
    labs(title = "Drug usage by country and practice", y = "Prevalence") +
    theme_classic()
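
The pos expression in the mutate() call above can be read as the midpoint of each bar when bars of width prop are laid end to end from x = 0. As a rough base-R sketch of that reading (bar_midpoints is a hypothetical helper written for this illustration, not a dplyr or ggplot2 function), using the first four prop values from the question:

```r
# Hypothetical helper (not part of dplyr or ggplot2): midpoint of each bar
# when bars of the given widths are placed end to end starting at x = 0.
bar_midpoints <- function(width) {
    right <- cumsum(width)   # right edge of each bar
    left  <- right - width   # left edge of each bar
    (left + right) / 2
}

prop <- c(0.027, 0.023, 0.002, 0.500)   # first four c1 practices in the question
pos  <- bar_midpoints(prop)

# The same quantity written the way the mutate() call writes it
pos_answer <- 0.5 * (cumsum(prop) + cumsum(c(0, prop[-length(prop)])))
stopifnot(isTRUE(all.equal(pos, pos_answer)))
print(pos)
```

Because each bar is centred on its own midpoint and is exactly prop wide, neighbouring bars touch without overlapping, so no dodging is needed.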

This is exactly what I was after! Brilliant! I haven't used dplyr before, so I have some learning to do! Thank you very much @Z.Lin! – MarcoD

0

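There is a lot of information you are trying to convey here. To contrast drug A and drug B across countries with bar plots that take the proportions into account, you can use the facet_grid function. Try this: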

library(ggplot2)

# one grey per practice: 10 practices in c1, 6 in c2, 2 in c3
colour <- c(rep("gray79", 10), rep("gray60", 6), rep("gray39", 2))

gr$drug <- paste("Drug", gr$drug)
p <- ggplot(data = gr, aes(x = factor(practice), y = prevalence,
                           ymax = high.CI, ymin = low.CI,
                           fill = practice, width = prop))

p + theme_bw() + facet_grid(drug ~ country, scales = "free") +
    geom_bar(stat = "identity") +
    labs(x = "Practice", y = "Prevalence") +
    geom_errorbar(position = position_dodge(0.9), width = 0.25, size = 0.25, colour = "black") +
    ggtitle("Drug usage by country and practice") +
    scale_fill_manual(values = colour) + guides(fill = "none")

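The widths are too small in country c1 because, as you specified, one practice there is quite influential. Also, you can specify your aesthetics once in ggplot(aes(...)) without having to restate them in each layer, and you don't need to include the data frame name inside aes() in a ggplot call.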
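
To illustrate that last point, here is a minimal sketch on three rows of the question's data. The name df is chosen only for this example. The aesthetics are declared once in ggplot() with bare column names, and both layers pick them up:

```r
library(ggplot2)

# First three drug "a" rows from the question; `df` is a name chosen for this sketch
df <- data.frame(
    practice   = c("a", "b", "c"),
    prevalence = c(9.14, 5.53, 16.74),
    low.CI     = c(8.27, 4.80, 12.35),
    high.CI    = c(10.10, 6.37, 22.31))

# Aesthetics are declared once, with bare column names (no df$...)
p <- ggplot(df, aes(x = practice, y = prevalence,
                    ymin = low.CI, ymax = high.CI)) +
    geom_col(fill = "gray79") +      # uses the inherited x and y
    geom_errorbar(width = 0.25)      # uses the inherited x, ymin and ymax
print(p)
```

Writing gr$prevalence inside aes() bypasses the data argument, which tends to break as soon as the plot is faceted or the data are reordered.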

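Thank you, this is great! Now that the bars are no longer on top of each other, I can see that it isn't doing what I wanted. My aim is to show how much data each practice contributes. For example, column 4 in country c1 accounts for 50% of the c1 data, so it should be much wider, column 3 accounts for only 0.2%, and so on. In other words, gr$prop is the proportion of each country's data that comes from each practice. So how do I get the column widths to reflect the values in gr$prop? Apologies if this is now a completely different question. Thank you very much for your help! – MarcoD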

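I also see that this solution does not include width = prop in p <- ggplot(), so maybe that is why? I tried adding it, but it gives the error message "Error: Incompatible lengths for set aesthetics: ymax, colour, size, width, ymin" – MarcoD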

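Sorry about that, Marco, my bad. I think we can keep using facet_grid and still get the desired effect. I think what is happening is that the width values are too large relative to the x scale, which is what causes the bars to stack on top of each other. I will amend my answer. HTH –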
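
One possible way to keep facet_grid and still have the widths follow gr$prop is to restart the bar midpoints within each country, so that every facet spans 0 to 1. The sketch below is an assumption, not the amended answer referred to above. It uses only the drug "a" rows for countries c2 and c3 from the question, and df and pos are names chosen for the illustration:

```r
library(ggplot2)

# Drug "a" rows for countries c2 and c3, copied from the question's data
df <- data.frame(
    practice   = c("k", "l", "m", "n", "o", "p", "q", "r"),
    country    = rep(c("c2", "c3"), times = c(6, 2)),
    prevalence = c(24.56, 22.29, 19.41, 20.25, 25.01, 25.87, 29.33, 20.76),
    prop       = c(0.433, 0.117, 0.053, 0.199, 0.098, 0.100, 0.594, 0.406),
    low.CI     = c(24.25, 21.72, 18.62, 19.83, 24.36, 25.22, 28.80, 20.20),
    high.CI    = c(24.87, 22.86, 20.23, 20.68, 25.67, 26.53, 29.86, 21.34))

# Bar midpoints, restarted at 0 within each country
df$pos <- ave(df$prop, df$country, FUN = function(w) cumsum(w) - w / 2)

p <- ggplot(df, aes(x = pos, y = prevalence, width = prop,
                    ymin = low.CI, ymax = high.CI, fill = country)) +
    geom_col(colour = "black") +
    geom_errorbar(colour = "black") +
    facet_grid(. ~ country) +
    theme_bw()
print(p)
```

The x axis here shows cumulative proportion rather than practice names. Labelling each bar with its practice would need per-facet labels, which this sketch leaves out.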