Aggregate function

I have a data set containing climate data, with samples taken in different seasons:

df <- data.frame(season=rep(1:5,2),year=rep(1:2,each=5), 
     temp=c(2,4,3,5,2,4,1,5,4,3),ppt=c(4,3,1,5,6,2,1,2,2,2), 
     samples=c(22,25,24,31,31,29,28,31,30,32))

I can compute the mean of my climate variables for each season of each year simply with:

aggregate(df[,c('temp','ppt')], by = list(df$season,df$year), function(x) mean(x,na.rm=T))

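However, I would like to compute the weighted mean for each season | year combination, using the variable samples as my weights.

Essentially, I want to replace the mean function in aggregate() with weighted.mean. That would require adding a second argument to my function, in addition to x:

function(x, w) weighted.mean(x, w, na.rm=T)

However, I do not know how to make the weight argument ("w") of weighted.mean() vary with each subset of the aggregated data.

Can I do all of this within the aggregate function?

Any advice would be great!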

Source

2016-01-25 theforestecologist

Try summarise_each from dplyr. It works together with group_by and applies a function to multiple columns within the existing groups:

library(dplyr) 
df %>% group_by(season, year) %>% 
     summarise_each(funs(weighted.mean(., samples,na.rm=T)), temp,ppt) 
# Source: local data frame [10 x 5] 
# Groups: season, year [10] 
# 
# season year temp ppt samples 
# (int) (int) (dbl) (dbl) (dbl) 
# 1  1  1  2  4  22 
# 2  2  1  4  3  25 
# 3  3  1  3  1  24 
# 4  4  1  5  5  31 
# 5  5  1  2  6  31 
# 6  1  2  4  2  29 
# 7  2  2  1  1  28 
# 8  3  2  5  2  31 
# 9  4  2  4  2  30 
# 10  5  2  3  2  32
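
For readers on a newer dplyr: summarise_each() and funs() were deprecated in later releases, and across() (available from dplyr 1.0.0) covers the same use. The following is only a sketch of an equivalent call, not the code the answer was tested with. Note that in the sample data every season/year combination has exactly one row, so the weighted mean per season/year is just the original value; the second call groups by season alone (my own variation, not part of the question) so that the weights actually change the result.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(season = rep(1:5, 2), year = rep(1:2, each = 5),
                 temp = c(2, 4, 3, 5, 2, 4, 1, 5, 4, 3),
                 ppt = c(4, 3, 1, 5, 6, 2, 1, 2, 2, 2),
                 samples = c(22, 25, 24, 31, 31, 29, 28, 31, 30, 32))

# Same grouping as the answer: one weighted mean per season/year combination.
# Inside across(), `samples` refers to the weights of the current group.
by_season_year <- df %>%
  group_by(season, year) %>%
  summarise(across(c(temp, ppt),
                   ~ weighted.mean(.x, samples, na.rm = TRUE)),
            .groups = "drop")

# Variation (assumption, for illustration): pool both years within each season,
# so every group has two rows and the weights matter.
by_season <- df %>%
  group_by(season) %>%
  summarise(across(c(temp, ppt),
                   ~ weighted.mean(.x, samples, na.rm = TRUE)))
```

Keep in mind that na.rm = TRUE in weighted.mean() only drops missing values in x; a missing weight still yields NA.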

Source

2016-01-25 20:28:24

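Is it possible to do this using 'aggregate' or any other function in R's base package? – theforestecologist

I don't know why you would need a complicated base solution when this works, but here you go. It would take a long time to explain: 'df[, c("temp","ppt")] <- matrix(ncol=2, unlist(do.call(rbind, lapply(split(df, list(df$season, df$year)), function(df){lapply(df[, c("temp","ppt")], function(cols) weighted.mean(cols, df$samples, na.rm=T))}))))'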
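
The one-liner in the comment above is hard to read, so here is a more spelled-out base R sketch of the same split-and-apply idea, followed by a way to stay inside aggregate() itself. aggregate() hands FUN one column of one subset at a time, so the weights are not visible to it; a common workaround is to aggregate over row indices and look both the values and the weights up from the data frame. The helper name weighted_means and its arguments are hypothetical, introduced only for this illustration.

```r
# Sample data from the question
df <- data.frame(season = rep(1:5, 2), year = rep(1:2, each = 5),
                 temp = c(2, 4, 3, 5, 2, 4, 1, 5, 4, 3),
                 ppt = c(4, 3, 1, 5, 6, 2, 1, 2, 2, 2),
                 samples = c(22, 25, 24, 31, 31, 29, 28, 31, 30, 32))

# Hypothetical helper: split the data by the grouping columns, then compute
# a weighted mean of every value column within each piece.
weighted_means <- function(data, group_cols, value_cols, weight_col) {
  pieces <- split(data, data[group_cols], drop = TRUE)
  rows <- lapply(pieces, function(g) {
    out <- g[1, group_cols, drop = FALSE]   # keep the group labels
    for (v in value_cols) {
      out[[v]] <- weighted.mean(g[[v]], g[[weight_col]], na.rm = TRUE)
    }
    out
  })
  result <- do.call(rbind, rows)
  rownames(result) <- NULL
  result
}

# One row per season/year combination (each group has a single row here,
# so the result equals the original values).
by_season_year <- weighted_means(df, c("season", "year"),
                                 c("temp", "ppt"), "samples")

# Staying inside aggregate(): aggregate row indices instead of values, so
# FUN can pick up the matching weights. Grouping by season alone is my own
# variation, chosen so that each group has more than one row.
agg_temp <- aggregate(seq_len(nrow(df)), by = list(season = df$season),
                      FUN = function(i) {
                        weighted.mean(df$temp[i], df$samples[i], na.rm = TRUE)
                      })
```

The aggregate() call returns the weighted mean in a column named x; repeating it per variable, or merging the results, is left out to keep the sketch short.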
