R: Evaluating multiple conditional statements many times

I have data like this:

df = as.data.frame(cbind(
    event1 = c(88.76,96.04,99.60,88.76,99.60,34.04,96.04,87.03,87.44,87.44), 
    time1 = c(0.100,0.033,0.000,0.117,0.000,0.000,0.050,0.500,0.133,0.117), 
    event2 = c(NA,99.60,NA,34.04,99.62,88.76,87.44,87.41,88.76,88.76), 
    time2 = c(NA,0.050,NA,0.100,0.017,0.083,0.200,0.500,0.133,0.050), 
    event100 = c(NA,89.52,NA,34.04,93.93,34.02,88.76,88.01,88.01,87.41), 
    time100 = c(NA,0.050,NA,0.100,0.033,0.117,0.300,0.500,0.233,0.300), 
    event_88.76_within_0.1 = rep(0,10) 
))

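Here event1 is the code of the first event a subject experienced and time1 is how long it took before event1 occurred. Each subject has up to 100 events and event times.

I am trying to create a variable (event_88.76_within_0.1) that indicates whether event 88.76 occurred within 0.1 minutes. So if any of a subject's events equals 88.76 and the corresponding time for that event is less than or equal to 0.1, the variable equals 1.

Using this nested for loop: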

for (r in 1:nrow(df)) {                # for each subject
    for (c in seq(1, 6, by = 2)) {     # for each event column (1, 3, 5), skipping the time columns
        # if the event code is not missing, it is the needed event code, and
        # the next column over (the corresponding time to event) is at most 0.1
        if (!is.na(df[r, c]) && df[r, c] == 88.76 && df[r, c + 1] <= 0.1) {
            df[r, "event_88.76_within_0.1"] = 1
        }
    }
}

I get this, which is what I want:

      event1 time1 event2 time2 event100 time100 event_88.76_within_0.1
 [1,]  88.76 0.100     NA    NA       NA      NA                      1
 [2,]  96.04 0.033  99.60 0.050    89.52   0.050                      0
 [3,]  99.60 0.000     NA    NA       NA      NA                      0
 [4,]  88.76 0.117  34.04 0.100    34.04   0.100                      0
 [5,]  99.60 0.000  99.62 0.017    93.93   0.033                      0
 [6,]  34.04 0.000  88.76 0.083    34.02   0.117                      1
 [7,]  96.04 0.050  87.44 0.200    88.76   0.300                      0
 [8,]  87.03 0.500  87.41 0.500    88.01   0.500                      0
 [9,]  87.44 0.133  88.76 0.133    88.01   0.233                      0
[10,]  87.44 0.117  88.76 0.050    87.41   0.300                      1

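But the dataset has thousands of subjects (each with up to 100 possible events), so the nested for loop takes a while to run.

I imagine a vectorized version of the loop above would look something like this: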

df$event_88.76_within_0.1 = 0 
df$event_88.76_within_0.1[df[ "events that equal 88.76 and occurred within 0.1" ]]=1

But I haven't had any luck.

Any help would be greatly appreciated.

Source

2017-06-06 JRF1111

You can do it like this:

## Define the names of your events and times columns 
events = paste0("event",c(1,2,100)) 
times = paste0("time",c(1,2,100)) 
## Check both conditions and multiply the results (TRUE * TRUE gives 1; anything multiplied by FALSE gives 0).
## rowSums() counts the matches per subject and pmin() caps that count at 1.
df$event_88.76_within_0.1 = pmin(1, rowSums((df[, events] == 88.76) * (df[, times] <= 0.1), na.rm = TRUE))

   event1 time1 event2 time2 event100 time100 event_88.76_within_0.1
1   88.76 0.100     NA    NA       NA      NA                      1
2   96.04 0.033  99.60 0.050    89.52   0.050                      0
3   99.60 0.000     NA    NA       NA      NA                      0
4   88.76 0.117  34.04 0.100    34.04   0.100                      0
5   99.60 0.000  99.62 0.017    93.93   0.033                      0
6   34.04 0.000  88.76 0.083    34.02   0.117                      1
7   96.04 0.050  87.44 0.200    88.76   0.300                      0
8   87.03 0.500  87.41 0.500    88.01   0.500                      0
9   87.44 0.133  88.76 0.133    88.01   0.233                      0
10  87.44 0.117  88.76 0.050    87.41   0.300                      1
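
With up to 100 event/time pairs, typing the column names by hand gets tedious. As a rough sketch (not part of the original answer), the names could be picked up by pattern instead. This assumes every column named eventN has a matching timeN column; flag_event_within is a hypothetical helper name introduced only for this illustration.

```r
# Same toy data as in the question (only 3 of the up-to-100 event/time pairs)
df <- data.frame(
  event1   = c(88.76, 96.04, 99.60, 88.76, 99.60, 34.04, 96.04, 87.03, 87.44, 87.44),
  time1    = c(0.100, 0.033, 0.000, 0.117, 0.000, 0.000, 0.050, 0.500, 0.133, 0.117),
  event2   = c(NA, 99.60, NA, 34.04, 99.62, 88.76, 87.44, 87.41, 88.76, 88.76),
  time2    = c(NA, 0.050, NA, 0.100, 0.017, 0.083, 0.200, 0.500, 0.133, 0.050),
  event100 = c(NA, 89.52, NA, 34.04, 93.93, 34.02, 88.76, 88.01, 88.01, 87.41),
  time100  = c(NA, 0.050, NA, 0.100, 0.033, 0.117, 0.300, 0.500, 0.233, 0.300)
)

# Assumption: event columns are named "event<number>" and each has a "time<number>" partner
events <- grep("^event[0-9]+$", names(df), value = TRUE)
times  <- sub("^event", "time", events)
stopifnot(all(times %in% names(df)))

# Hypothetical helper: 1 if any event equals `code` with its time <= `limit`, else 0
flag_event_within <- function(data, events, times, code, limit) {
  # drop = FALSE keeps a data frame even when only one event column is selected
  hit <- (data[, events, drop = FALSE] == code) & (data[, times, drop = FALSE] <= limit)
  # NA comparisons are dropped by na.rm = TRUE, so missing events never count as hits
  as.integer(rowSums(hit, na.rm = TRUE) > 0)
}

df$event_88.76_within_0.1 <- flag_event_within(df, events, times, 88.76, 0.1)
df$event_88.76_within_0.1
# 1 0 0 0 0 1 0 0 0 1
```

The same helper can then be reused for other event codes or time limits by changing the last two arguments.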

Source

2017-06-06 23:34:44 Lamia

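Beautiful! Thank you. – JRF1111

Bah! You beat me to it ;) –

@Lamia, nice answer! One small suggestion: if the condition holds n times in a row, you get 'n' rather than '1'. I'd suggest wrapping an 'ifelse' around it, or around the final variable, like 'ifelse(rowSums((df[, events] == 88.76) * (df[, times] <= 0.1), na.rm = T) > 0, 1, 0)' –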
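
To make the point in the comment above concrete, here is a small sketch with made-up data (the toy data frame is hypothetical and not from the question) in which one subject meets the condition twice. The bare rowSums() returns a count, while the pmin() and ifelse() wrappers both reduce it to a 0/1 flag.

```r
# Hypothetical data: subject 1 has event 88.76 within 0.1 twice
toy <- data.frame(
  event1 = c(88.76, 88.76, 34.04),
  time1  = c(0.05,  0.50,  0.01),
  event2 = c(88.76, NA,    88.76),
  time2  = c(0.10,  NA,    0.20)
)
events <- c("event1", "event2")
times  <- c("time1", "time2")

# Number of matching events per subject (NA products are dropped by na.rm = TRUE)
n_hits <- rowSums((toy[, events] == 88.76) * (toy[, times] <= 0.1), na.rm = TRUE)
n_hits                      # values: 2 0 0

# Three equivalent ways to turn the count into a 0/1 indicator
flag <- as.integer(n_hits > 0)
flag                        # 1 0 0
pmin(1, n_hits)             # values: 1 0 0
ifelse(n_hits > 0, 1, 0)    # values: 1 0 0
```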

How about this duct-tape approach...

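cond1 <- df[, seq(1, 6, by = 2)] == 88.76   # event columns (1, 3, 5) equal to the target code
cond2 <- df[, seq(2, 6, by = 2)] <= 0.1     # time columns (2, 4, 6) within 0.1
vec <- which(rowSums(cond1 & cond2, na.rm = TRUE) >= 1)   # rows where both hold for at least one event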

df[vec,] 
## event1 time1 event2 time2 event100 time100 
## 1 88.76 0.100  NA NA  NA  NA 
## 6 34.04 0.000 88.76 0.083 34.02 0.117 
## 10 87.44 0.117 88.76 0.050 87.41 0.300
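
The position-based idea can also be written without hard-coding the 6, which may help with the full 100-event data. This is only a sketch under the assumption that the data frame contains nothing but strictly alternating event and time columns (event first); the variable names are illustrative.

```r
# Same toy data as in the question, without the indicator column
df <- data.frame(
  event1   = c(88.76, 96.04, 99.60, 88.76, 99.60, 34.04, 96.04, 87.03, 87.44, 87.44),
  time1    = c(0.100, 0.033, 0.000, 0.117, 0.000, 0.000, 0.050, 0.500, 0.133, 0.117),
  event2   = c(NA, 99.60, NA, 34.04, 99.62, 88.76, 87.44, 87.41, 88.76, 88.76),
  time2    = c(NA, 0.050, NA, 0.100, 0.017, 0.083, 0.200, 0.500, 0.133, 0.050),
  event100 = c(NA, 89.52, NA, 34.04, 93.93, 34.02, 88.76, 88.01, 88.01, 87.41),
  time100  = c(NA, 0.050, NA, 0.100, 0.033, 0.117, 0.300, 0.500, 0.233, 0.300)
)

# Assumption: odd positions hold event codes, even positions hold the matching times
event_cols <- seq(1, ncol(df), by = 2)
time_cols  <- event_cols + 1

# Logical matrix: TRUE where an event equals 88.76 and its time is at most 0.1
hit <- as.matrix(df[, event_cols]) == 88.76 & as.matrix(df[, time_cols]) <= 0.1

# 0/1 indicator per subject, plus the matching row numbers
flag <- as.integer(rowSums(hit, na.rm = TRUE) > 0)
which(flag == 1)
# 1 6 10
```

The column positions have to be computed before any extra column (such as the indicator itself) is appended, otherwise ncol(df) no longer describes only event/time pairs.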

Source

2017-06-06 23:37:07

Also a great answer (I like that using column numbers instead of @Lamia's naming approach makes it a bit more flexible), but Lamia beat you by a minute. – JRF1111
