2017-07-31
"f","index","values","lo.80","lo.95","hi.80","hi.95" 

"auto.arima",2017-07-31 16:40:00,2.81613884762163,NA,NA,NA,NA 

"auto.arima",2017-07-31 16:40:10,2.83441637197378,NA,NA,NA,NA 

"auto.arima",2017-07-31 20:39:10,3.18497899649267,2.73259824384436,2.49312233904087,3.63735974914098,3.87683565394447 

"auto.arima",2017-07-31 20:39:20,3.16981166809297,2.69309866988864,2.44074205235297,3.64652466629731,3.89888128383297 

"ets",2017-07-31 16:40:00,2.93983529828936,NA,NA,NA,NA 

"ets",2017-07-31 16:40:10,3.09739640066054,NA,NA,NA,NA 

"ets",2017-07-31 20:39:10,3.1951571771414,2.80966705285567,2.60560090776504,3.58064730142714,3.78471344651776 

"ets",2017-07-31 20:39:20,3.33876776870274,2.93593322313957,2.72268549604222,3.7416023142659,3.95485004136325 

"bats",2017-07-31 16:40:00,2.82795253090081,NA,NA,NA,NA 

"bats",2017-07-31 16:40:10,2.96389759682623,NA,NA,NA,NA 

"bats",2017-07-31 20:39:10,3.1383560278272,2.76890864400062,2.573335012715,3.50780341165378,3.7033770429394 

"bats",2017-07-31 20:39:20,3.3561357998535,2.98646195085452,2.79076843614824,3.72580964885248,3.92150316355876 

I have a data frame like the one above, with the column names "f", "index", "values", "lo.80", "lo.95", "hi.80", "hi.95". The task is to calculate a weighted average in this R data frame.

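What I want to do is calculate the weighted average of the forecasts from the different models at a given timestamp. By this I mean:

For every auto.arima row there is a corresponding row for ets and for bats with the same timestamp, so the weighted average would be calculated like this:

value_arima * 1/3 + values_ets * 1/3 + values_bats * 1/3; the same calculation should be applied to lo.80 and the other columns.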
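As a quick check of the formula above, the equally weighted combination for a single timestamp can be worked out by hand. This is only a minimal sketch: it uses the three `values` entries for 2017-07-31 20:39:10 from the data above, and the variable names are chosen for illustration.

```r
# Point forecasts for 2017-07-31 20:39:10, copied from the data above
value_arima <- 3.18497899649267
value_ets   <- 3.1951571771414
value_bats  <- 3.1383560278272

# Equal weights of 1/3 each, as in the formula
avg <- value_arima * 1/3 + value_ets * 1/3 + value_bats * 1/3

# With equal weights this coincides with the plain arithmetic mean
stopifnot(isTRUE(all.equal(avg, mean(c(value_arima, value_ets, value_bats)))))

print(round(avg, 6))
```

Because all three weights are equal, this is the ordinary mean of the three forecasts rather than a weighted average in the strict sense.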

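The result should be stored in a new data frame that contains all the weighted averages.

The new data frame could look like this:

index (timestamp from the data frame above), avg, avg_lo_80, avg_lo_95, avg_hi_80, avg_hi_95

I think I need to use the spread() and mutate() functions to achieve this. I am new to R and cannot work out how to proceed once this data frame has been built.

Please help.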

I would temporarily delete this post while you learn how to get the formatting in order, otherwise you are likely to receive a lot of downvotes. – snoram

@snoram, is it OK now? – Ashag

Better, but not good. I think it would be better to use dput on a subset of your data. See this: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – snoram
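For readers unfamiliar with the suggestion, sharing data via `dput` amounts to something like the sketch below. It assumes a small hypothetical data frame `small` standing in for the real one; `dput` prints R code that rebuilds the object, so others can paste it straight into their session.

```r
# Hypothetical two-row stand-in for the real data frame
small <- data.frame(
  f      = c("auto.arima", "ets"),
  index  = c("2017-07-31 16:40:00", "2017-07-31 16:40:00"),
  values = c(2.81613884762163, 2.93983529828936),
  stringsAsFactors = FALSE
)

# Print a copy-pasteable definition of the first rows
dput(head(small, 2))

# Round trip: capture the dput text and evaluate it to rebuild the object
txt     <- paste(capture.output(dput(small)), collapse = "\n")
rebuilt <- eval(parse(text = txt))
stopifnot(isTRUE(all.equal(rebuilt, small)))
```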

Answer

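The example you give is not a weighted average but a simple average, so what you want is a simple aggregation. The first part below is the dataset as produced by dput (a better way to share it here).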

d <- structure(list(f = structure(c(1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 
2L, 2L, 2L, 2L), .Label = c("auto.arima", "bats", "ets"), class = "factor"), 
index = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 
3L, 4L), .Label = c("2017-07-31 16:40:00", "2017-07-31 16:40:10", 
"2017-07-31 20:39:10", "2017-07-31 20:39:20"), class = "factor"), 
values = c(2.81613884762163, 2.83441637197378, 3.18497899649267, 
3.16981166809297, 2.93983529828936, 3.09739640066054, 3.1951571771414, 
3.33876776870274, 2.82795253090081, 2.96389759682623, 3.1383560278272, 
3.3561357998535), lo.80 = c(NA, NA, 2.73259824384436, 2.69309866988864, 
NA, NA, 2.80966705285567, 2.93593322313957, NA, NA, 2.76890864400062, 
2.98646195085452), lo.95 = c(NA, NA, 2.49312233904087, 2.44074205235297, 
NA, NA, 2.60560090776504, 2.72268549604222, NA, NA, 2.573335012715, 
2.79076843614824), hi.80 = c(NA, NA, 3.63735974914098, 3.64652466629731, 
NA, NA, 3.58064730142714, 3.7416023142659, NA, NA, 3.50780341165378, 
3.72580964885248), hi.95 = c(NA, NA, 3.87683565394447, 3.89888128383297, 
NA, NA, 3.78471344651776, 3.95485004136325, NA, NA, 3.7033770429394, 
3.92150316355876)), .Names = c("f", "index", "values", "lo.80", 
"lo.95", "hi.80", "hi.95"), class = "data.frame", row.names = c(NA, 
-12L)) 

> aggregate(d[,3:7], by = d["index"], FUN = mean)
                index   values    lo.80    lo.95    hi.80    hi.95
1 2017-07-31 16:40:00 2.861309       NA       NA       NA       NA
2 2017-07-31 16:40:10 2.965237       NA       NA       NA       NA
3 2017-07-31 20:39:10 3.172831 2.770391 2.557353 3.575270 3.788309
4 2017-07-31 20:39:20 3.288238 2.871831 2.651399 3.704646 3.925078

If you want, you can save this output in an object and change the column names.
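Saving and renaming might look like the sketch below. To stay self-contained it rebuilds only a reduced stand-in for `d` (the two 20:39 timestamps and the `values`, `lo.80` and `hi.80` columns, copied from the data above); the new column names follow the layout requested in the question and the object name `avg` is an arbitrary choice.

```r
# Reduced stand-in for `d`: two timestamps, three numeric columns
d <- data.frame(
  f      = rep(c("auto.arima", "ets", "bats"), each = 2),
  index  = rep(c("2017-07-31 20:39:10", "2017-07-31 20:39:20"), times = 3),
  values = c(3.18497899649267, 3.16981166809297,
             3.1951571771414,  3.33876776870274,
             3.1383560278272,  3.3561357998535),
  lo.80  = c(2.73259824384436, 2.69309866988864,
             2.80966705285567, 2.93593322313957,
             2.76890864400062, 2.98646195085452),
  hi.80  = c(3.63735974914098, 3.64652466629731,
             3.58064730142714, 3.7416023142659,
             3.50780341165378, 3.72580964885248),
  stringsAsFactors = FALSE
)

# Simple mean per timestamp, stored in a new data frame
avg <- aggregate(d[, c("values", "lo.80", "hi.80")],
                 by = d["index"], FUN = mean)

# Rename the columns to the layout asked for in the question
names(avg) <- c("index", "avg", "avg_lo_80", "avg_hi_80")
avg
```

With the full data frame, the same call over columns 3 to 7 and five new names would cover `lo.95` and `hi.95` as well.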

If you really want a weighted average, here is one way to get it (here bats has a weight of 0.8 and the two other models 0.1 each):

> d$weight <- (d$f) 
> levels(d$weight) # check the levels 
[1] "auto.arima" "bats"  "ets"  
> levels(d$weight) <- c(0.1, 0.8, 0.1) 
> # transform the factor into numbers
> # warning: as.numeric(d$weight) alone is not correct, it returns the level codes
> d$weight <- as.numeric(as.character((d$weight)))
> 
> # Here the result is saved in a data.frame called "result"
> result <- aggregate(d[,3:7] * d$weight, by = d["index"], FUN = sum) 
> result 
                index   values    lo.80    lo.95    hi.80    hi.95
1 2017-07-31 16:40:00 2.837959       NA       NA       NA       NA
2 2017-07-31 16:40:10 2.964299       NA       NA       NA       NA
3 2017-07-31 20:39:10 3.148698 2.769353 2.568540 3.528043 3.728857
4 2017-07-31 20:39:20 3.335767 2.952073 2.748958 3.719460 3.922576
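
An alternative sketch that avoids recoding factor levels: look the weights up in a named vector and let `weighted.mean()` do the arithmetic per timestamp. The weight vector `w` and the reduced stand-in data (only the `values` column for two timestamps, copied from above) are assumptions for illustration, not part of the approach shown above.

```r
# Reduced stand-in for `d`: two timestamps, point forecasts only
d <- data.frame(
  f      = rep(c("auto.arima", "ets", "bats"), each = 2),
  index  = rep(c("2017-07-31 16:40:00", "2017-07-31 20:39:10"), times = 3),
  values = c(2.81613884762163, 3.18497899649267,
             2.93983529828936, 3.1951571771414,
             2.82795253090081, 3.1383560278272),
  stringsAsFactors = FALSE
)

# Weights keyed by model name (same illustrative weights as above)
w <- c(auto.arima = 0.1, ets = 0.1, bats = 0.8)
d$weight <- unname(w[as.character(d$f)])

# Split the rows by timestamp and compute one weighted mean per group
result <- do.call(rbind, lapply(split(d, d$index), function(g) {
  data.frame(index = g$index[1],
             avg   = weighted.mean(g$values, g$weight),
             stringsAsFactors = FALSE)
}))
rownames(result) <- NULL
result
```

Note that `weighted.mean()` divides by the sum of the weights, so the weights do not have to add up to 1, whereas the multiply-and-sum version above relies on that.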

Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/150747/discussion-between-ashag-and-gilles). – Ashag