I want to transform a dataset that has one row for an individual with id = 1:
id time status
--------------
1 t status
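Here t is the time of some event, and status is 1 if the event occurred or 0 if it did not (in which case t is the duration of the study).
Suppose t lies between a2 and a3.
My goal is to transform my data into the following: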
id period start stop status
---------------------------
1 1 0 a1 0
1 2 a1 a2 0
1 3 a2 t status
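The total time of individual 1 is split into three intervals, with no event in (0, a1) or (a1, a2).
Question
Can you think of an efficient way to write an R function that takes a dataset and a vector a as input and returns the transformed dataset?
Edit
Part 1: I have been asked for a concrete example. Here is one: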
id time status
--------------
1 5 1
with a1 = 1, a2 = 3, a3 = 7.
Part 2: I have also been asked to show my attempt. Here it is:
> data <- data.frame(id=1, time=5, status=1)
> a <- c(1, 3, 7)
> N <- nrow(data)
> data$period <- ifelse(data$time < a[1], 1,
+ ifelse(data$time < a[2], 2,
+ ifelse(data$time < a[3], 3, 4)))
>
>
> dataTemp1 <- data.frame(matrix(nrow=N, ncol=ncol(data)))
> names(dataTemp1) <- names(data)
> dataTemp2 <- data.frame(matrix(nrow=N, ncol=ncol(data)))
> names(dataTemp2) <- names(data)
> dataTemp3 <- data.frame(matrix(nrow=N, ncol=ncol(data)))
> names(dataTemp3) <- names(data)
> dataTemp4 <- data.frame(matrix(nrow=N, ncol=ncol(data)))
> names(dataTemp4) <- names(data)
>
> for(j in 1:N)
+ {
+ if(data[j, "period"] == 1){
+ data[j, "start"] <- 0
+ data[j, "stop"] <- data[j, "time"]
+ } else if(data[j, "period"] == 2){
+ dataTemp1[j, c("id", "time", "period")] <-
+ data[j, c("id", "time", "period")]
+ dataTemp1[j, "start"] <- 0
+ dataTemp1[j, "stop"] <- a[1]
+ dataTemp1[j, "status"] <- 0
+
+ data[j, "start"] <- a[1]
+ data[j, "stop"] <- data[j, "time"]
+ } else if(data[j, "period"] == 3){
+ dataTemp1[j, c("id", "time", "period")] <-
+ data[j, c("id", "time", "period")]
+ dataTemp1[j, "start"] <- 0
+ dataTemp1[j, "stop"] <- a[1]
+ dataTemp1[j, "status"] <- 0
+
+ dataTemp2[j, c("id", "time", "period")] <-
+ data[j, c("id", "time", "period")]
+ dataTemp2[j, "start"] <- a[1]
+ dataTemp2[j, "stop"] <- a[2]
+ dataTemp2[j, "status"] <- 0
+
+ data[j, "start"] <- a[2]
+ data[j, "stop"] <- data[j, "time"]
+ } else if(data[j, "period"] == 4){
+ dataTemp1[j, c("id", "time", "period")] <-
+ data[j, c("id", "time", "period")]
+ dataTemp1[j, "start"] <- 0
+ dataTemp1[j, "stop"] <- a[1]
+ dataTemp1[j, "status"] <- 0
+
+ dataTemp2[j, c("id", "time", "period")] <-
+ data[j, c("id", "time", "period")]
+ dataTemp2[j, "start"] <- a[1]
+ dataTemp2[j, "stop"] <- a[2]
+ dataTemp2[j, "status"] <- 0
+
+ dataTemp3[j, c("id", "time", "period")] <-
+ data[j, c("id", "time", "period")]
+ dataTemp3[j, "start"] <- a[2]
+ dataTemp3[j, "stop"] <- a[3]
+ dataTemp3[j, "status"] <- 0
+
+ data[j, "start"] <- a[3]
+ data[j, "stop"] <- data[j, "time"]
+ }
+ }
>
> dataTemp1 <- dataTemp1[complete.cases(dataTemp1), ]
> dataTemp2 <- dataTemp2[complete.cases(dataTemp2), ]
> dataTemp3 <- dataTemp3[complete.cases(dataTemp3), ]
> dataTemp4 <- dataTemp4[complete.cases(dataTemp4), ]
>
> data <- rbind(data, dataTemp1, dataTemp2, dataTemp3, dataTemp4)
> data[, "period"] <- ifelse(data[, "start"] == 0, 1,
+ ifelse(data[, "start"] == a[1], 2,
+ ifelse(data[, "start"] == a[2], 3,
+ ifelse(data[, "start"] == a[3], 4,
+ 5))))
> data <- data[order(data$id, data$start),
+ c("id", "period", "start", "stop", "status")]
> data
id period start stop status
2 1 1 0 1 0
3 1 2 1 3 0
1 1 3 3 5 1
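You should provide a reproducible example. What are the ai, dates? Why not give some numeric values rather than just symbols, and also show what you have tried? – agstudy
@agstudy: I have edited the question. However, I want a function, not a program that only works for one example. – user7064
@Arun: Wahou, thx! If you post it as an answer, I will accept it! – user7064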
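For reference, here is a minimal sketch of what such a function could look like in base R. It is not the approach from the comments above. The name split_periods is made up for illustration, and the sketch assumes that data has the columns id, time and status, that every time is positive, and that a is sorted in increasing order.

```r
# Hypothetical helper (not from the original post): split each subject's
# follow-up time at the cut points in `a`.
# Assumptions: `data` has columns id, time, status; time > 0; `a` is increasing.
split_periods <- function(data, a) {
  pieces <- lapply(seq_len(nrow(data)), function(i) {
    t <- data$time[i]
    # Interval boundaries for this subject: 0, every cut point below t, then t
    bounds <- c(0, a[a < t], t)
    n <- length(bounds) - 1
    data.frame(
      id     = data$id[i],
      period = seq_len(n),
      start  = bounds[-length(bounds)],
      stop   = bounds[-1],
      # Earlier intervals carry status 0; the original status goes on the last one
      status = c(rep(0, n - 1), data$status[i])
    )
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

# Concrete example from the question
data <- data.frame(id = 1, time = 5, status = 1)
a <- c(1, 3, 7)
split_periods(data, a)
```

With the example data this returns three rows for id 1, covering 0 to 1, 1 to 3 and 3 to 5, with status 0, 0 and 1. One difference from the attempt in the question: a time that falls exactly on a cut point is kept in the interval ending at that cut point, so no zero-length interval is produced. If the survival package is available, its survSplit function performs this kind of splitting as well.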