How can I create a dummy that depends on all previous values of another variable, where the number of previous values is arbitrary?
My data looks like this:
library(data.table)
dt <- data.table(from = as.Date(c("20020101", "20030101", "20040101", "20050101",
"20010101", "20020101", "20030101", "20040101", "20050101"), "%Y%m%d"),
to = as.Date(c("20031231", "20041231", "20051231", "20061231",
"20021231", "20031231", "20041231", "20051231", "20061231"), "%Y%m%d"),
id = as.factor(c(1, 1, 1, 1, 2, 2, 2, 2, 2)),
cond = c(F, F, T, F, F, T, T, T, F))
> dt
from to id cond
1: 2002-01-01 2003-12-31 1 FALSE
2: 2003-01-01 2004-12-31 1 FALSE
3: 2004-01-01 2005-12-31 1 TRUE
4: 2005-01-01 2006-12-31 1 FALSE
5: 2001-01-01 2002-12-31 2 FALSE
6: 2002-01-01 2003-12-31 2 TRUE
7: 2003-01-01 2004-12-31 2 TRUE
8: 2004-01-01 2005-12-31 2 TRUE
9: 2005-01-01 2006-12-31 2 FALSE
What I need is a dummy where dum = 1 if cond == TRUE for any s <= t, and dum = 0 if cond == FALSE for all s <= t, with s and t indexing the periods within each id. The desired result:
from to id cond dum
1: 2002-01-01 2003-12-31 1 FALSE 0
2: 2003-01-01 2004-12-31 1 FALSE 0
3: 2004-01-01 2005-12-31 1 TRUE 1
4: 2005-01-01 2006-12-31 1 FALSE 1
5: 2001-01-01 2002-12-31 2 FALSE 0
6: 2002-01-01 2003-12-31 2 TRUE 1
7: 2003-01-01 2004-12-31 2 TRUE 1
8: 2004-01-01 2005-12-31 2 TRUE 1
9: 2005-01-01 2006-12-31 2 FALSE 1
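I tried working with lags, i.e. creating N lags for each id, where N is the number of periods for which i is alive. However, since individuals are not alive for a fixed number of periods, this approach got too messy.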
This is what I tried to develop for the case where all i's are alive for the same number of periods (i.e., all i's are alive for 4 periods):
dt <- dt[1:8, ]
dum <- c()
# Iterate through all unique IDs
for(i in unique(dt$id)){
  # Subset the data
  dt.tmp <- dt[id == i, ]
  N <- nrow(dt.tmp)-1
  nm <- paste("lag.cond", 1:N, sep = "")
  # iterate through all periods and lag cond
  for(j in 1:N){
    dt.tmp[, (nm[j]) := shift(.SD, n = j), by = id, .SDcols = "cond"]
  }
  # If any of the lags are == TRUE => set dum to 1
  dt.tmp[, dum := ifelse(cond | lag.cond1 | lag.cond2 | lag.cond3, 1, 0)]
  dt.tmp[is.na(dum), dum := 0]
  dum <- append(dum, dt.tmp$dum)
}
dt[, dum := dum]
dt
'cummax' integerifies for you, try 'cummax(c(FALSE,TRUE,FALSE))' – Frank
@Frank integerify, that is a wonderful word. Thanks, I will add a note to my answer. – lmo
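For readers following the cummax hint in the comments, here is a minimal sketch of how it might be applied per id. It assumes the rows are already sorted in time order within each id (as they are in the example data) and rebuilds only the id and cond columns from the question. It is an illustration of the idea, not necessarily the exact code from lmo's answer.

```r
library(data.table)

# Same id and cond values as in the question; the date columns are omitted
# because the rows are assumed to be in time order within each id already.
dt <- data.table(
  id   = factor(c(1, 1, 1, 1, 2, 2, 2, 2, 2)),
  cond = c(FALSE, FALSE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, FALSE)
)

# cummax() coerces the logical vector to integer, so within each id the
# running maximum becomes 1 at the first TRUE and stays 1 afterwards.
dt[, dum := cummax(cond), by = id]

print(dt)
```

Because the running maximum is computed by group, this does not depend on how many periods each id is observed for, which is what made the lag-based approach awkward.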