2017-09-11

Clean way to reorder a tidyr spread after a nested purrr map

Consider the following:

library(tidyverse) 
library(broom) 

tidy.quants <- mtcars %>% 
    nest(-cyl) %>% 
    mutate(quantiles = map(data, ~ quantile(.$mpg))) %>% 
    unnest(map(quantiles, tidy)) 

tidy.quants 
#> # A tibble: 15 x 3 
#>  cyl names  x 
#> <dbl> <chr> <dbl> 
#> 1  6 0% 17.80 
#> 2  6 25% 18.65 
#> 3  6 50% 19.70 
#> 4  6 75% 21.00 
#> 5  6 100% 21.40 
#> 6  4 0% 21.40 
#> 7  4 25% 22.80 
#> 8  4 50% 26.00 
#> 9  4 75% 30.40 
#> 10  4 100% 33.90 
#> 11  8 0% 10.40 
#> 12  8 25% 14.40 
#> 13  8 50% 15.20 
#> 14  8 75% 16.25 
#> 15  8 100% 19.20 

While this is nice and tidy, when trying to spread (or pass to a plot), the names column comes back in a (somewhat) unexpected order:

tidy.quants %>% spread(names, x) 
#> # A tibble: 3 x 6 
#>  cyl `0%` `100%` `25%` `50%` `75%` 
#> * <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> 
#> 1  4 21.4 33.9 22.80 26.0 30.40 
#> 2  6 17.8 21.4 18.65 19.7 21.00 
#> 3  8 10.4 19.2 14.40 15.2 16.25 

ggplot(tidy.quants, aes(x = names, y = x, color = factor(cyl))) + 
    geom_point() 


Is there a clean/idiomatic way to have names come back in the expected order? That is, 0%, 25%, 50%, 75%, 100% rather than 0%, 100%, 25%, 50%, 75%.

Answers


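You can try gtools::mixedsort, which sorts strings that contain embedded numbers. Get the ordered labels with mixedsort(unique(names)); then, as is done for color, turn names (the x-axis variable) into a factor whose levels are those sorted values. ggplot should then display the x-axis labels in the correct order: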

library(gtools) 
ggplot(tidy.quants, aes(x = factor(names, levels = mixedsort(unique(names))), y = x, color = factor(cyl))) + 
    geom_point() + xlab('names') 
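If adding gtools as a dependency is not desirable, the same ordering can probably be obtained in base R by sorting the labels on their numeric value. This is only a sketch under the assumption that every label is a number followed by "%"; the variable names (labels, numeric_value, ordered_labels) are illustrative and not part of the original answer.

```r
# Percent labels in the alphabetical order that spread() and ggplot produce
labels <- c("0%", "100%", "25%", "50%", "75%")

# Strip the "%" sign, convert to numeric, and order the labels by that value
numeric_value <- as.numeric(sub("%", "", labels, fixed = TRUE))
ordered_labels <- labels[order(numeric_value)]

# ordered_labels is now 0%, 25%, 50%, 75%, 100%

# Use the ordered labels as factor levels, in place of mixedsort(unique(names))
f <- factor(labels, levels = ordered_labels)
levels(f)
```

The resulting levels can be passed to factor() exactly where mixedsort(unique(names)) is used above.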



For spread, a similar idea:

tidy.quants %>% 
    mutate(names = factor(names, mixedsort(unique(names)))) %>% 
    spread(names, x) 

# A tibble: 3 x 6 
# cyl `0%` `25%` `50%` `75%` `100%` 
#* <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> 
#1  4 21.4 22.80 26.0 30.40 33.9 
#2  6 17.8 18.65 19.7 21.00 21.4 
#3  8 10.4 14.40 15.2 16.25 19.2 
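As a possible alternative, assuming a tidyr version that provides pivot_wider(): with its default names_sort = FALSE, the new columns are documented to appear in order of first appearance rather than alphabetically, so no factor conversion should be needed. The sketch below rebuilds the long table with base R so it does not depend on nest()/unnest(); quant_list, long and wide are illustrative names.

```r
library(tidyr)

# Quantiles of mpg per cyl group, as a named list of named numeric vectors
quant_list <- lapply(split(mtcars$mpg, mtcars$cyl), quantile)

# Stack them into a long data frame with the same columns as tidy.quants
long <- do.call(rbind, lapply(names(quant_list), function(cyl) {
  q <- quant_list[[cyl]]
  data.frame(cyl = as.numeric(cyl), names = names(q), x = unname(q),
             stringsAsFactors = FALSE)
}))

# pivot_wider() keeps the columns in order of first appearance by default
wide <- pivot_wider(long, names_from = names, values_from = x)
colnames(wide)
```

Here the column order follows the order in which quantile() emits its labels, which is the same assumption the factor-based approach relies on.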

This works because names already appears in quantile order, so unique(names) yields the factor levels in the desired order:

tidy.quants <- mtcars %>% 
    nest(-cyl) %>% 
    mutate(quantiles = map(data, ~ quantile(.$mpg))) %>% 
    unnest(map(quantiles, tidy)) %>% 
    mutate(names=factor(names,unique(names))) 

tidy.quants %>% spread(names, x) 

Result

# A tibble: 3 x 6 
    cyl `0%` `25%` `50%` `75%` `100%` 
* <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> 
1  4 21.4 22.80 26.0 30.40 33.9 
2  6 17.8 18.65 19.7 21.00 21.4 
3  8 10.4 14.40 15.2 16.25 19.2
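To see why passing unique(names) as the levels matters, the following base R sketch compares the default factor levels with the first-appearance levels. The vector labels is a stand-in for the names column and is not taken from the original answer.

```r
# Labels in the order quantile() returns them, repeated once per cyl group
labels <- rep(c("0%", "25%", "50%", "75%", "100%"), times = 3)

# Default: levels are the sorted unique values, so "100%" follows "0%"
default_levels <- levels(factor(labels))

# Explicit levels: unique() preserves first-appearance order
appearance_levels <- levels(factor(labels, levels = unique(labels)))

default_levels
appearance_levels
```

Note that this relies on the first group listing its labels in the intended order; if the rows were shuffled beforehand, unique() would preserve the shuffled order instead.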