2015-03-31 55 views
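Grouped weighted mean in R (preferably with ddply, but whatever works)

I have a data set and want to compute both the plain mean and a weighted mean per group. Each group can be thought of as a different portfolio or stock: price is the price of that portfolio or stock, size is the number of shares, and gain is the percentage return, so the market cap is price * size.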

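The weighted mean should be the gain weighted by each row's share of its group's market cap. I ran the code below and the result looks clearly wrong, but for the life of me I can't figure out what I'm missing: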

mydf= structure(list(group = structure(c(1L, 2L, 1L, 2L, 1L), .Label = c("a","b"), class = "factor"), 
        price = c(15, 20, 10, 40, 20), size = c(100, 10, 50, 50, 1000), 
        gain = c(0.03, 0.02, 0.05, 0.1, 0.01), wt = c(1500, 200, 500, 2000, 20000)), 
       .Names = c("group", "price", "size", "gain", "wt"), row.names = c(NA, -5L), 
       class = "data.frame") 
mydf 
library(plyr) 
ddply(x, .(group), summarise,normal_mean= mean(gain), 
     wt_mean= weighted.mean(x$price*x$size,x$gain)) 

Here I multiply price and size together; you could also just use the wt column in the weighted.mean function, or assume...

The weighted mean does not seem to be computed per group, but across all rows. Any help?

Answers

With data.table:

library(data.table) 

setDT(mydf)[,list(normalMean=mean(gain), 
      weightedMean=weighted.mean(gain, wt/sum(wt))), 
      by = group] 

#    group normalMean weightedMean
# 1:     a       0.03   0.01227273
# 2:     b       0.06   0.09272727
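
The `wt/sum(wt)` normalization above is harmless but should not be needed, since `weighted.mean` already divides by the sum of the weights. A quick base R check for group a, with the values copied by hand from `mydf` in the question (the variable names `by_hand`, `raw` and `normalized` are only illustrative):

```r
# Rows of group "a", copied from mydf in the question
gain <- c(0.03, 0.05, 0.01)
wt   <- c(1500, 500, 20000)

# Weighted mean written out explicitly
by_hand <- sum(gain * wt) / sum(wt)

# weighted.mean with raw weights and with weights normalized to sum to 1
raw        <- weighted.mean(gain, wt)
normalized <- weighted.mean(gain, wt / sum(wt))

# All three should agree up to floating-point tolerance
all.equal(by_hand, raw)
all.equal(raw, normalized)
```

So passing `wt` directly gives the same per-group result as passing the normalized weights.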

One approach, with dplyr:

library(dplyr)

mydf %>% group_by(group) %>%
  summarise(mean = mean(gain), avgwt = weighted.mean(gain, wt))


  group mean      avgwt
1     a 0.03 0.01227273
2     b 0.06 0.09272727
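
Since the question asked for ddply, here is a minimal sketch of how the original call might be fixed, assuming the intent is gain weighted by market cap (price * size). Two things appear to go wrong in the question's code: the `x$...` references pull columns from the whole data frame instead of the current group's piece, and the arguments to `weighted.mean` are swapped (values come first, weights second). The data frame is rebuilt here only so the snippet runs on its own, and `res` is just an illustrative name:

```r
library(plyr)

# Same data as in the question (wt omitted, since it equals price * size)
mydf <- data.frame(
  group = factor(c("a", "b", "a", "b", "a")),
  price = c(15, 20, 10, 40, 20),
  size  = c(100, 10, 50, 50, 1000),
  gain  = c(0.03, 0.02, 0.05, 0.1, 0.01)
)

# Bare column names are evaluated within each group's piece of the data,
# so both means are computed per group rather than across all rows
res <- ddply(mydf, .(group), summarise,
             normal_mean = mean(gain),
             wt_mean     = weighted.mean(gain, price * size))
res
```

This should give the same per-group values as the data.table and dplyr answers.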