Group rows up to the current row in R data.table
I have a dataset that looks like this:
library(data.table)
set.seed(10)
n_rows <- 50
data <- data.table(id = 1:n_rows,
                   timestamp = Sys.Date() + as.difftime(1:n_rows, units = "days"),
                   subject = sample(letters[1:4], n_rows, replace = TRUE),
                   response = sample(3, n_rows, replace = TRUE)
)
head(data, 10)
    id  timestamp subject response
 1:  1 2016-05-17       c        2
 2:  2 2016-05-18       b        3
 3:  3 2016-05-19       b        1
 4:  4 2016-05-20       c        2
 5:  5 2016-05-21       a        1
 6:  6 2016-05-22       a        2
 7:  7 2016-05-23       b        2
 8:  8 2016-05-24       b        2
 9:  9 2016-05-25       c        2
10: 10 2016-05-26       b        2
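I need to do a group-by operation that counts, for each subject, how many times each response has occurred so far (up to and including the current row).
The group-by below produces the nth_test column.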
new_vars <- data[, .(id, timestamp, nth_test = 1:.N, response), by=.(subject)]
    subject id  timestamp nth_test response
 1:       c  1 2016-05-17        1        2
 2:       c  4 2016-05-20        2        2
 3:       c  9 2016-05-25        3        2
 4:       c 11 2016-05-27        4        1
 5:       c 12 2016-05-28        5        1
 6:       c 14 2016-05-30        6        2
 7:       c 22 2016-06-07        7        2
 8:       c 26 2016-06-11        8        2
 9:       c 31 2016-06-16        9        3
10:       c 36 2016-06-21       10        1
But I don't know how to produce the columns resp_1, resp_2 and resp_3 shown below.
    subject id  timestamp nth_test response resp_1 resp_2 resp_3
 1:       c  1 2016-05-17        1        2      0      1      0
 2:       c  4 2016-05-20        2        2      0      2      0
 3:       c  9 2016-05-25        3        2      0      3      0
 4:       c 11 2016-05-27        4        1      1      3      0
 5:       c 12 2016-05-28        5        1      2      3      0
 6:       c 14 2016-05-30        6        2      2      4      0
 7:       c 22 2016-06-07        7        2      2      5      0
 8:       c 26 2016-06-11        8        2      2      6      0
 9:       c 31 2016-06-16        9        3      2      6      1
10:       c 36 2016-06-21       10        1      3      6      1
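Cheers
How is your data sorted? These column values depend on the order of your data. You could do something like 'resp_i := cumsum(response == i)' – Psidom
@Psidom that's exactly what I needed, thanks. – efbbrown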
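For reference, here is a minimal sketch of how Psidom's cumsum suggestion might be applied per subject. The small hand-made table, the assumption that responses are coded 1 to 3, and the use of id as a stand-in for timestamp ordering are illustrative assumptions, not part of the original question.

```r
library(data.table)

# Toy data (hypothetical): six tests across two subjects
data <- data.table(
  id       = 1:6,
  subject  = c("c", "b", "c", "c", "b", "c"),
  response = c(2L, 3L, 2L, 1L, 1L, 2L)
)

# The running counts depend on row order, so sort within subject first
# (id stands in for timestamp here)
setorder(data, subject, id)

# Running test number within each subject
data[, nth_test := seq_len(.N), by = subject]

# Assumed response codes; one cumulative-count column per code
resp_values <- 1:3
resp_cols <- paste0("resp_", resp_values)

# For each code i, cumsum(response == i) counts occurrences so far in the group
data[, (resp_cols) := lapply(resp_values, function(i) cumsum(response == i)),
     by = subject]

print(data)
```

Because every row has exactly one response, resp_1 + resp_2 + resp_3 should equal nth_test on each row, which makes a quick sanity check.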