2017-07-28

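Can I create multiple columns from a single group with mutate?

I want to group a data frame by one column and then apply a function to each group that returns multiple columns. As an example, consider the following: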

library(dplyr)

Names <- c(rep('Mark', 10), rep('Joe', 10))
Spend <- rnorm(length(Names), 50, 0.5)

df <- data.frame(
    Names,
    Spend
)

# Returns the median and the mean of a numeric vector as a two-element list
get.mm <- function(data) {
    return(list(median(data), mean(data)))
}

Here, get.mm returns a list of two numbers. I want to apply get.mm to df %>% group_by(Names) and get a result with two columns, one for each output of the function.

The desired result should look like this:

   Names   median    mean
  <fctr>    <dbl>   <dbl>
1    Joe 49.89284 49.9504
2   Mark 50.17244 50.0735

I am using a simple function here only for demonstration. I know that in this case I could just do something like:

df %>% group_by(Names) %>% summarise(median = median(Spend), mean = mean(Spend)) 

See `summarise_at()` and https://cran.r-project.org/web/packages/dplyr/vignettes/programming.html


This blog post is very relevant: https://www.r-bloggers.com/programming-with-dplyr-by-using-dplyr/

Answer

If you rewrite get.mm so that it returns a data frame, you can use group_by %>% do:

get.mm <- function(data){ 
    data.frame(median = median(data), mean = mean(data)) 
} 

df %>% group_by(Names) %>% do(get.mm(.$Spend)) 
# here . stands for a sub data frame with a unique Name, .$Spend passes the corresponding 
# column to the function 

Reproducible example:

library(dplyr)

set.seed(1)
Names <- c(rep('Mark', 10), rep('Joe', 10))
Spend <- rnorm(length(Names), 50, 0.5)
df <- data.frame(Names, Spend)

df %>% group_by(Names) %>% do(get.mm(.$Spend)) 

# A tibble: 2 x 3 
# Groups: Names [2] 
# Names median  mean 
# <fctr> <dbl> <dbl> 
#1 Joe 50.24594 50.12442 
#2 Mark 50.12829 50.06610 

df %>% group_by(Names) %>% summarise(median = median(Spend), mean = mean(Spend)) 

# A tibble: 2 x 3 
# Names median  mean 
# <fctr> <dbl> <dbl> 
#1 Joe 50.24594 50.12442 
#2 Mark 50.12829 50.06610 
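
As a possible alternative to do(), and assuming dplyr 1.0.0 or later (an assumption, since the answer above was written for an older release), summarise() can take an unnamed expression that returns a data frame and spread its columns into the result. A minimal sketch that reuses the data-frame version of get.mm from the answer:

```r
library(dplyr)

set.seed(1)
df <- data.frame(
    Names = c(rep("Mark", 10), rep("Joe", 10)),
    Spend = rnorm(20, 50, 0.5)
)

# Data-frame-returning version of get.mm, as defined in the answer
get.mm <- function(data) {
    data.frame(median = median(data), mean = mean(data))
}

# Assumes dplyr >= 1.0.0: the unnamed data-frame result is unpacked
# into one column per column of the returned data frame
res <- df %>%
    group_by(Names) %>%
    summarise(get.mm(Spend))

print(res)
```

The result should have one row per name with the columns Names, median and mean, the same shape as the do() output.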

I'm not much of a dplyr user, but would something like 'df %>% group_by(Names) %>% summarise_all(funs(mean, median))' be acceptable? – thelatemail


@thelatemail That definitely works and is a good option. 'group_by %>% do' is meant for cases more complicated than the one the OP showed. – Psidom
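
A note on the summarise_all(funs(...)) suggestion: funs() has been deprecated since dplyr 0.8.0, so on a current installation the same idea is usually written with across(). The sketch below assumes dplyr 1.0.0 or later; the output column names Spend_mean and Spend_median come from the default naming of across(), not from the original thread.

```r
library(dplyr)

set.seed(1)
df <- data.frame(
    Names = c(rep("Mark", 10), rep("Joe", 10)),
    Spend = rnorm(20, 50, 0.5)
)

# Assumes dplyr >= 1.0.0: apply a named list of functions to Spend per group.
# Default output names follow the pattern "<column>_<function name>".
res_across <- df %>%
    group_by(Names) %>%
    summarise(across(Spend, list(mean = mean, median = median)))

print(res_across)
```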
