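Calculating two-day and five-day averages based on another column
I want to calculate 2-day and 5-day averages of temperature and humidity based on another variable (stay). If stay is 0 (the patient was admitted today), the 2-day average should be calculated from the values of the same date (day 1) and the previous day (day 2). Similarly, for stay 1 the average is calculated from the values of days 2 and 3, counting back from the date. For all stay values above 8, the 2-day average is calculated from days 9 and 10 before the current date. The 5-day average for stay 0 depends on the values of days 1, 2, 3, 4 and 5. The tables below show how the averages are calculated.
Example of the required output: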
Date stay Temperature Humidity 2d temp 5d temp
9-Mar-98 6 4.23 74.32 na na
10-Mar-98 1 5.16 70.33 2.12 na
11-Mar-98 8 7.39 65.77 na na
14-Mar-98 3 6.63 66.46 6.27 3.35
23-Mar-98 2 11.03 62.94 11.13 13.97
24-Mar-98 10 10.87 57.35 10.09 8.78
4-Apr-98 0 9.64 59.21 8.68 9.51
5-Apr-98 5 9.70 88.30 16.14 13.81
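Some explanation: on 11 March the stay is 8, so the average is set to NA because no values are available. On 14 March the stay is 3, so the 2-day average is calculated from the values of 10 and 11 March. On 5 April the stay is 5, so the 2-day average is calculated from the values of 30 and 31 March (counting starts 5 days before the current date).
The table below shows, for each value of stay, which days (counted back from the current date, which is day 1) are used for each average.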
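The date arithmetic in the explanation above can be checked with a small sketch. `window_dates` is a hypothetical helper name, not something from the original post, and the cap of 8 on stay is taken from the description in the question.

```r
# Hypothetical helper: calendar dates that enter an n-day average.
# Day 1 is `date` itself, so day k lies k - 1 days before `date`.
window_dates <- function(date, stay, n, max_stay = 8) {
  s <- min(stay, max_stay)   # stays above 8 are treated like stay 8
  date - (s + n - 1):s       # oldest day first
}

window_dates(as.Date("1998-03-14"), stay = 3, n = 2)  # 10 and 11 March
window_dates(as.Date("1998-04-05"), stay = 5, n = 2)  # 30 and 31 March
```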
stay 2d average 5d averages
0 1,2 1,2,3,4,5
1 2,3 2,3,4,5,6
2 3,4 3,4,5,6,7
3 4,5 4,5,6,7,8
4 5,6 5,6,7,8,9
5 6,7 6,7,8,9,10
6 7,8 7,8,9,10,11
7 8,9 8,9,10,11,12
8 9,10 9,10,11,12,13
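The pattern in this table is that an n-day average for stay s uses days s + 1 through s + n. One possible way to compute it in base R is sketched below. `lagged_mean` is a hypothetical name, and the sketch assumes one row per calendar day, sorted by date, with no gaps. It also assumes that a window reaching back before the first row should give NA, which may differ from how incomplete windows are handled in the example output above.

```r
# Hypothetical helper (not from the original post): mean of x over the
# n days s + 1, ..., s + n, where day 1 is the current row and s = stay.
lagged_mean <- function(x, stay, n, max_stay = 8) {
  s <- pmin(stay, max_stay)  # stays above 8 use the same window as stay 8
  vapply(seq_along(x), function(i) {
    idx <- i - s[i] - seq_len(n) + 1  # row numbers for days s+1, ..., s+n
    if (any(idx < 1)) NA_real_ else mean(x[idx])
  }, numeric(1))
}

# First eight rows of the question's sample data, rounded to two decimals
stay        <- c(6, 1, 8, 11, 27, 3, 4, 5)
temperature <- c(4.23, 5.16, 7.38, 9.47, 7.62, 6.63, 8.72, 11.46)

temp_2d <- lagged_mean(temperature, stay, 2)
temp_5d <- lagged_mean(temperature, stay, 5)
```

With the full data frame the same call would be `mydata$temp_2d <- lagged_mean(mydata$temperature, mydata$stay, 2)`, and likewise for `humid` and for `n = 5`.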
Sample data:
> dput(mydata)
structure(list(date = structure(c(10294, 10295, 10296, 10297,
10298, 10299, 10300, 10301, 10302, 10303, 10304, 10305, 10306,
10307, 10308, 10309, 10310, 10311, 10312, 10313, 10314, 10315,
10316, 10317, 10318, 10319, 10320, 10321, 10322, 10323, 10324
), class = "Date"), stay = c(6, 1, 8, 11, 27, 3, 4, 5, 11, 13,
2, 17, 26, 6, 2, 10, 5, 2, 11, 24, 8, 11, 2, 8, 7, 30, 0, 5,
1, 2, 2), temperature = c(4.23000001907349, 5.15541648864746,
7.38499999046326, 9.47041666507721, 7.61999988555908, 6.62625002861023,
8.71875, 11.4608333110809, 11.2570832967758, 14.5691666603088,
10.3120833337307, 11.1216666698456, 11.1420832872391, 11.241666674614,
11.03125, 10.8691666722298, 12.4862499237061, 13.9341666698456,
11.8995833396912, 12.3716666698456, 12.5091667175293, 16.3833332061768,
15.8945832252502, 7.26666665077209, 7.0091667175293, 7.73125004768372,
9.63833332061768, 9.7045833170414, 11.4941666126251, 11.1304166316986,
11.3908333778381), humid = c(74.3199996948242, 70.3308334350586,
65.7658309936523, 69.2799987792969, 83.1170806884766, 66.4599990844727,
67.4225006103516, 85.7504196166992, 89.9520797729492, 65.2566680908203,
43.3604164123535, 51.7508316040039, 54.6866683959961, 68.2958297729492,
62.9420852661133, 57.3504180908203, 66.4137496948242, 57.6333351135254,
78.9029159545898, 84.5666656494141, 84.2004165649414, 71.2779159545898,
74.0320816040039, 65.2512512207031, 58.8224983215332, 62.4949989318848,
59.2054176330566, 88.2983322143555, 71.2545852661133, 78.0783309936523,
51.9004173278809)), datalabel = "", time.stamp = " 3 Apr 2015 22:09", .Names = c("date",
"stay", "temperature", "humid"), formats = c("%dD_m_Y", "%9.0g",
"%9.0g", "%9.0g"), types = c(255L, 255L, 255L, 255L), val.labels = c("",
"", "", ""), var.labels = c(" ", "", "temp", "rh"), expansion.fields = list(
c("_dta", "_lang_list", "default"), c("_dta", "_lang_c",
"default")), row.names = c("1", "2", "3", "4", "5", "6",
"7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17",
"18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28",
"29", "30", "31"), version = 12L, class = "data.frame")
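Welcome to SO! Please note that this site is not a code-writing service. Requests to have code written for you are usually not well received here. Please take the time to learn how to ask questions that attract useful answers. – 2015-04-03 20:51:13
I would use the dplyr window functions 'lead' / 'lag' http://cran.r-project.org/web/packages/dplyr/vignettes/window-functions.html to add extra columns for the temperature of the last day, the second-to-last day, ... and use 'ifelse' to distinguish the cases, e.g. 'mydata <- mydata %<% mutate(temp_last_day = lag(temperature), temp_second_last_day = lag(temp_last_day, ...)' and so on. But there is surely a more elegant solution. – ckluss 2015-04-03 20:59:55
@ckluss I am new to R, and the examples in the link are a bit hard to follow. When I tried the code you provided, I got the error message 'could not find function "%<%"'. What am I missing to run the code? – Harawe 2015-04-04 17:45:14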
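The reported error is consistent with a typo in the pipe: the operator that dplyr provides is `%>%`, and `%<%` is not defined anywhere. A minimal sketch of the lag-column idea with the pipe corrected is shown below. It assumes dplyr is installed, and it uses a small illustrative data frame (`df`, with temperatures rounded from the sample data) rather than the full `mydata`.

```r
library(dplyr)  # provides %>%, mutate() and lag()

# Small illustrative data frame, not the full sample data
df <- data.frame(temperature = c(4.23, 5.16, 7.38, 9.47))

df <- df %>%
  mutate(temp_last_day        = lag(temperature),     # value one row back
         temp_second_last_day = lag(temperature, 2))  # value two rows back
```

The first rows of the new columns are NA because there is no earlier row to take a value from.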