2017-06-13
0

Reordering a ggplot2 plot based on the values of one factor level

I have a data frame, df, with 46 rows and 3 columns.

I want to create a plot of the values of the youth_activity_rc variable by the program_ID variable, as in this code and figure:

library(ggplot2) 
# df has no program_name column, so program_ID is used on the x axis 
ggplot(df, aes(x = program_ID, y = total_minutes_p, group = youth_activity_rc, fill = youth_activity_rc)) + 
    geom_col(position = position_stack(reverse = TRUE)) + 
    coord_flip() 

geom_col figure

However, I want the program_ID variable reordered according to the values of the Not Focused level of the youth_activity_rc variable.

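There are many questions demonstrating how to do this based on a single variable (i.e. this question), but I could not find one that reorders based on the values associated with one factor level (in this case, Not Focused). It seems simple, but at least with the solutions recommended in other answers (i.e. using stats::reorder() or dplyr::arrange()), it is not.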

The data are here:

df <- structure(list(program_ID = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 
2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 
5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 
8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L), .Label = c("1", "2", "4", "5", 
"6", "7", "8", "9", "10"), class = "factor"), youth_activity_rc = structure(c(2L, 
6L, 5L, 1L, 3L, 2L, 6L, 1L, 3L, 2L, 6L, 5L, 1L, 3L, 2L, 6L, 4L, 
5L, 1L, 3L, 2L, 6L, 5L, 1L, 3L, 2L, 6L, 1L, 3L, 2L, 6L, 4L, 1L, 
3L, 2L, 6L, 4L, 5L, 1L, 3L, 2L, 6L, 4L, 5L, 1L, 3L), .Label = c("Not Focused", 
"Basic Skills Activity", "Program Staff Led", "Field Trip Speaker", 
"Lab Activity", "Creating Product"), class = "factor"), total_minutes_p = c(0.248, 
0.116, 0.075, 0.458, 0.103, 0.466, 0.015, 0.202, 0.317, 0.248, 
0.263, 0.006, 0.372, 0.111, 0.183, 0.172, 0.088, 0.048, 0.305, 
0.203, 0.157, 0.066, 0.079, 0.592, 0.106, 0.128, 0.423, 0.423, 
0.026, 0.176, 0.233, 0.125, 0.426, 0.04, 0.164, 0.188, 0.046, 
0.007, 0.524, 0.072, 0.163, 0.112, 0.013, 0.021, 0.567, 0.124 
)), .Names = c("program_ID", "youth_activity_rc", "total_minutes_p" 
), row.names = c(NA, -46L), vars = "program_ID", labels = structure(list(
    program_ID = c(1, 2, 4, 5, 6, 7, 8, 9, 10)), .Names = "program_ID", row.names = c(NA, 
-9L), class = "data.frame", vars = "program_ID", drop = TRUE), indices = list(
    0:4, 5:8, 9:13, 14:19, 20:24, 25:28, 29:33, 34:39, 40:45), drop = TRUE, group_sizes = c(5L, 
4L, 5L, 6L, 5L, 4L, 5L, 6L, 6L), biggest_group_size = 6L, class = c("grouped_df", 
"tbl_df", "tbl", "data.frame")) 

Answers

1

One option is to order your dataset by youth_activity_rc and total_minutes_p, and then use fct_inorder from the forcats package before plotting.

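fct_inorder sets the levels of a factor in the order in which they first appear in the dataset. That is why the dataset is sorted into the desired order first: Not Focused is the first level of youth_activity_rc, so after sorting, its rows come first, ordered by total_minutes_p, and the program_ID levels are taken from that order.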
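To make the behaviour of fct_inorder concrete, here is a minimal sketch on a made-up factor (the values "a", "b", "c" are purely illustrative and not from the question's data). It also shows what should be an equivalent base R construction for comparison.

```r
library(forcats)

# Hypothetical toy factor: by default the levels are sorted alphabetically.
f <- factor(c("b", "c", "a", "b"))
levels(f)   # "a" "b" "c"

# fct_inorder() resets the levels to their order of first appearance.
g <- fct_inorder(f)
levels(g)   # "b" "c" "a"

# Base R construction that gives the same level order.
h <- factor(f, levels = unique(as.character(f)))
identical(levels(g), levels(h))
```

The data values themselves are untouched; only the level order changes, and the level order is what ggplot2 uses to position a discrete axis.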

library(dplyr) 
library(forcats) 

df2 = df %>% 
    ungroup() %>% 
    arrange(youth_activity_rc, total_minutes_p) %>% 
    mutate(program_ID = fct_inorder(program_ID)) 

And the plot:

ggplot(df2, aes(x = program_ID, y = total_minutes_p, 
      group = youth_activity_rc, 
      fill = youth_activity_rc)) + 
    geom_col(position = position_stack(reverse = TRUE)) + 
    coord_flip() 


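Use fct_relevel within arrange to make the level you want to base the ordering on the first level. For example, if you wanted to order the plot by total_minutes_p for "Creating Product" rather than "Not Focused":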

df2 = df %>% 
    ungroup() %>% 
    arrange(fct_relevel(youth_activity_rc, "Creating Product"), total_minutes_p) %>% 
    mutate(program_ID = fct_inorder(program_ID)) 
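As a quick check of what this does, the sketch below runs the same pipeline on a small made-up data frame (the names toy, activity and minutes_p and all values are hypothetical). It suggests that the relevel only affects the sort key inside arrange and leaves the factor's own levels as they were.

```r
library(dplyr)
library(forcats)

# Hypothetical toy data: three programs, two activity types.
toy <- data.frame(
  program_ID = factor(rep(c("1", "2", "3"), each = 2)),
  activity   = factor(rep(c("Not Focused", "Creating Product"), times = 3),
                      levels = c("Not Focused", "Creating Product")),
  minutes_p  = c(0.6, 0.4, 0.3, 0.7, 0.5, 0.5)
)

toy2 <- toy %>%
  # Sort "Creating Product" rows first, then by value within each activity.
  arrange(fct_relevel(activity, "Creating Product"), minutes_p) %>%
  # Take the program levels from the sorted row order.
  mutate(program_ID = fct_inorder(program_ID))

levels(toy2$program_ID)   # "1" "3" "2": ascending "Creating Product" value
levels(toy2$activity)     # "Not Focused" "Creating Product": unchanged
```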


+0

Is there any direct way to do this if 'Not Focused' did not happen to be the first level of 'youth_activity_rc', rather than setting it as the first level? –

+1

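Not that I know of, but if you don't want to permanently change the levels of the factor you are ordering by, it can be done within 'arrange'. See the edit. – aosmith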
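For the case raised in this comment, one possible alternative (not taken from either answer) is to pass stats::reorder() a key that is zero everywhere except on the "Not Focused" rows, with FUN = sum. This is only a sketch on made-up data: the names toy, activity, minutes_p and key are hypothetical, and a program with no "Not Focused" row would simply get a score of 0.

```r
# Hypothetical toy data; "Not Focused" is deliberately NOT the first level
# (default alphabetical levels put "Creating Product" first).
toy <- data.frame(
  program_ID = factor(rep(c("1", "2", "3"), each = 2)),
  activity   = factor(rep(c("Creating Product", "Not Focused"), times = 3)),
  minutes_p  = c(0.4, 0.6, 0.7, 0.3, 0.5, 0.5)
)

# Score each program by its "Not Focused" value only: every other row
# contributes 0, so the per-program sum equals the "Not Focused" value.
key <- ifelse(toy$activity == "Not Focused", toy$minutes_p, 0)
toy$program_ID <- reorder(toy$program_ID, key, FUN = sum)

levels(toy$program_ID)   # "2" "3" "1": ascending "Not Focused" value
```

This does not depend on where "Not Focused" sits among the levels of the activity factor, and it leaves that factor untouched.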

1

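A similar approach to aosmith's, but without using forcats/dplyr for the data manipulation. You can get the order within the "Not Focused" group, then re-factor your data so that the levels follow that order. Something like: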

levs <- df[which(df$youth_activity_rc == "Not Focused"), ] # Get the "Not Focused" group 
ord <- order(levs$total_minutes_p) # Order within the selected group ($ returns a vector for tibbles too) 

# Use the program IDs, in that order, as the levels of the new factor 
df$program_ID_2 <- factor(df$program_ID, levels = as.character(levs$program_ID[ord])) 

ggplot(df, aes(x = program_ID_2, y = total_minutes_p, 
       group = youth_activity_rc, 
       fill = youth_activity_rc)) + 
    geom_col(position = position_stack(reverse = TRUE)) + 
    coord_flip() 


Note that I created a new variable called program_ID_2, but you don't have to.
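One thing to keep in mind when adapting the base R approach: the df in the question carries the tbl_df class, and single-column extraction with [ , "col"] behaves differently on a tibble than on a plain data frame. A minimal sketch of the difference, using made-up data (tb and dfr are hypothetical names):

```r
library(tibble)

# Hypothetical toy data, once as a tibble and once as a plain data frame.
tb  <- tibble(id = factor(c("a", "b")), x = c(2, 1))
dfr <- as.data.frame(tb)

is.data.frame(tb[, "x"])    # TRUE: a tibble keeps a single column as a tibble
is.data.frame(dfr[, "x"])   # FALSE: a plain data frame drops to a vector

# `$` and `[[` return a plain vector for both classes.
identical(tb$x, dfr$x)
identical(tb[["x"]], dfr[, "x"])
```

Using $ or [[ when a vector is needed (for order() or for the levels argument of factor()) should therefore work for either class.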
