2017-09-24 86 views
2

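Copy factor level order from one column to another

I have two columns in a data.frame whose levels should be sorted in the same order, but I don't know how to do this in a straightforward way.

Here's the situation: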

library(ggplot2) 
library(dplyr) 
library(magrittr) 
set.seed(1) 
df1 <- data.frame(rating = sample(c("GOOD","BAD","AVERAGE"),10,T), 
        div = sample(c("A","B","C"),10,T), 
        n = sample(100,10,T)) 

# I'm adding a label column that I use for plotting purposes 
df1 <- df1 %>% group_by(rating) %>% mutate(label = paste0(rating," (",sum(n),")")) %>% ungroup 
# # A tibble: 10 x 4 
#  rating div  n   label 
#  <fctr> <fctr> <int>   <chr> 
# 1  BAD  C 48  BAD (220) 
# 2  BAD  B 87  BAD (220) 
# 3  BAD  C 44  BAD (220) 
# 4 GOOD  B 25  GOOD (77) 
# 5 AVERAGE  B  8 AVERAGE (117) 
# 6 AVERAGE  C 10 AVERAGE (117) 
# 7 AVERAGE  A 32 AVERAGE (117) 
# 8 GOOD  B 52  GOOD (77) 
# 9 AVERAGE  C 67 AVERAGE (117) 
# 10  BAD  C 41  BAD (220) 

# rating levels are sorted 
df1$rating <- factor(df1$rating,c("BAD","AVERAGE","GOOD")) 

ggplot(df1,aes(x=rating,y=n,fill=div)) + geom_col() # plots in the order I want 
ggplot(df1,aes(x=label,y=n,fill=div)) + geom_col() # doesn't because levels aren't sorted 

How can I copy the factor level order from one column to another? I can get it to work this way, but I think it's really awkward:

lvls <- df1 %>% select(rating,label) %>% unique %>% arrange(rating) %>% extract2("label") 
df1$label <- factor(df1$label,lvls) 
ggplot(df1,aes(x=label,y=n,fill=div)) + geom_col() 

Answers

3

Once you have set the levels of rating, you can use forcats to set the levels of label in the order given by rating, like this...

library(forcats) 
df1 <- df1 %>% group_by(rating) %>% 
       mutate(label=paste0(rating," (",sum(n),")")) %>% 
       ungroup %>% 
       arrange(rating) %>%    #sort by rating 
       mutate(label=fct_inorder(label)) #set levels by order in which they appear 

Or you can use forcats::fct_reorder to do the same thing...

df1$label <- fct_reorder(df1$label, as.numeric(df1$rating)) 

The plot then comes out in the right order.
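For comparison, the same idea can be sketched in base R without forcats: order the rows by rating, take the labels in the order they first appear, and use that as the level order. This is only a minimal sketch on a small made-up data frame (the values and the name `df` are assumptions for illustration, not the question's sample).

```r
# Small illustrative data (hypothetical values, not the question's sample).
df <- data.frame(
  rating = factor(c("GOOD", "BAD", "AVERAGE", "BAD", "GOOD"),
                  levels = c("BAD", "AVERAGE", "GOOD")),
  n = c(10, 20, 30, 40, 50)
)

# Build the label text: rating plus the per-rating total of n on every row.
df$label <- paste0(df$rating, " (", ave(df$n, df$rating, FUN = sum), ")")

# order() on a factor sorts by its level order, so the labels come out
# in rating order; unique() keeps the first occurrence of each label.
lvls <- unique(df$label[order(df$rating)])
df$label <- factor(df$label, levels = lvls)

levels(df$label)
```

This relies on each label mapping to exactly one rating, which holds here because the rating name is part of the label.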

+1

Thanks, this package is interesting. Exploring it, I found the function `fct_reorder`, which allows a less verbose approach: `df1$label <- fct_reorder(df1$label, as.numeric(df1$rating))`. Maybe you could add it to your answer?

+0

Thanks - will do!

3

Instead of adding a label column and using aes(x = label, ...), you could stick with aes(x = rating, ...) and create the labels in scale_x_discrete:

ggplot(df1, aes(x = rating, y = n, fill = div)) + 
    geom_col() + 
    scale_x_discrete(labels = df1 %>% 
        group_by(rating) %>% 
        summarize(n = sum(n)) %>% 
        mutate(lab = paste0(rating, " (", n, ")")) %>% 
        pull(lab)) 
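A possible variation on this, as a hedged sketch: build the labels as a named vector keyed by the rating levels, so each label is matched to its axis break by name rather than by position. The data frame `df` and its values are made up for illustration, and the plotting step is guarded so the label construction still runs if ggplot2 is not installed.

```r
# Small illustrative data (hypothetical values, not the question's sample).
df <- data.frame(
  rating = factor(c("GOOD", "BAD", "AVERAGE", "BAD", "GOOD"),
                  levels = c("BAD", "AVERAGE", "GOOD")),
  div = c("A", "B", "C", "A", "B"),
  n = c(10, 20, 30, 40, 50)
)

# Per-rating totals; tapply() returns them in factor level order.
totals <- tapply(df$n, df$rating, sum)

# Named character vector: names are the rating levels, values are the labels.
axis_labels <- setNames(paste0(names(totals), " (", totals, ")"), names(totals))

# Plot only if ggplot2 is available; call print(p) to draw the plot.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = rating, y = n, fill = div)) +
    geom_col() +
    scale_x_discrete(labels = axis_labels)
}
```

The axis order still comes from the levels of rating; the named vector only changes the text shown at each tick.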
