2017-05-03

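Retrieve/replace a column value from subsequent rows using a specific criterion

For each row, I am trying to replace the value in the result column with the return value from the next row where day equals 1. For example: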

Reproducible example:

set.seed(123)
df <- data.frame(return = sample(runif(10, min = 0, max = 1)), day = seq(5, 1, by = -1), result = 0)
df

Expected output:

      return day    result
1  0.4566147   5 0.2875775
2  0.9404673   4 0.2875775
3  0.0455565   3 0.2875775
4  0.5514350   2 0.2875775
5  0.2875775   1 0.2875775
6  0.5281055   5 0.7883051
7  0.8924190   4 0.7883051
8  0.8830174   3 0.7883051
9  0.4089769   2 0.7883051
10 0.7883051   1 0.7883051

Any help is much appreciated.

Answers


For example, using dplyr:

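library(dplyr)
df %>%
    # start a new group on the row that follows each day == 1 row
    mutate(group = cumsum(lag(day, default = 0) == 1)) %>%
    group_by(group) %>%
    # within each group, copy the return value from the day == 1 row
    mutate(result = return[day == 1]) %>%
    ungroup()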

# # A tibble: 10 × 4
#       return   day    result group
#        <dbl> <dbl>     <dbl> <int>
#  1 0.4566147     5 0.2875775     0
#  2 0.9404673     4 0.2875775     0
#  3 0.0455565     3 0.2875775     0
#  4 0.5514350     2 0.2875775     0
#  5 0.2875775     1 0.2875775     0
#  6 0.5281055     5 0.7883051     1
#  7 0.8924190     4 0.7883051     1
#  8 0.8830174     3 0.7883051     1
#  9 0.4089769     2 0.7883051     1
# 10 0.7883051     1 0.7883051     1
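
To see why the grouping works, it may help to look at the intermediate vectors. The following is only a minimal sketch in base R: the day column is hard-coded, the lag is written by hand, and the names prev_day and group are chosen purely for illustration.

```r
# The day column from the question, hard-coded for illustration
day <- rep(5:1, times = 2)

# Shift day down by one position, padding the first slot with 0
# (this mimics what lag(day, default = 0) does in the pipeline)
prev_day <- c(0, head(day, -1))

# The counter increases on the row that follows a day == 1 row,
# so every row up to and including a day == 1 row shares one group id
group <- cumsum(prev_day == 1)
print(group)
# [1] 0 0 0 0 0 1 1 1 1 1
```

Because the comparison is made on the previous row's day, the day == 1 row itself stays in the same group as the rows above it, which is what lets return[day == 1] be picked up per group.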

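A data.frame solution:

# dplyr::lag() is called explicitly: stats::lag() from base R does not shift a plain vector
df$result <- df[df$day == 1, "return"][cumsum(dplyr::lag(df$day, default = 0) == 1) + 1]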
df 
     return day result 
1 0.4566147 5 0.2875775 
2 0.9404673 4 0.2875775 
3 0.0455565 3 0.2875775 
4 0.5514350 2 0.2875775 
5 0.2875775 1 0.2875775 
6 0.5281055 5 0.7883051 
7 0.8924190 4 0.7883051 
8 0.8830174 3 0.7883051 
9 0.4089769 2 0.7883051 
10 0.7883051 1 0.7883051 
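
If you would rather not depend on a lag() function at all, one possible variant is to look up, for each row, the first day == 1 row at or after it. This is a sketch under the assumption that only base R is available; the data are typed in from the output above, and the names ones and nxt are hypothetical helpers.

```r
# Data typed in from the printed output, so the example does not depend on the RNG
df <- data.frame(
  return = c(0.4566147, 0.9404673, 0.0455565, 0.5514350, 0.2875775,
             0.5281055, 0.8924190, 0.8830174, 0.4089769, 0.7883051),
  day = rep(5:1, times = 2),
  result = 0
)

# Row numbers where day == 1 (already sorted, as findInterval() requires)
ones <- which(df$day == 1)

# findInterval() counts how many day == 1 rows lie strictly before each row;
# adding 1 points at the first day == 1 row at or after the current row
nxt <- ones[findInterval(seq_len(nrow(df)) - 1, ones) + 1]

# Copy the return value from that row
df$result <- df$return[nxt]
df
```

Rows that come after the last day == 1 row have no later match, so nxt is NA there and result becomes NA rather than raising an error.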

A data.table approach:

library(data.table) 

setDT(df)[, result := return[day == 1], by = .(grp = cumsum(c(1, diff(day != 1) == 1)))][]

#  return day result 
# 1: 0.4566147 5 0.2875775 
# 2: 0.9404673 4 0.2875775 
# 3: 0.0455565 3 0.2875775 
# 4: 0.5514350 2 0.2875775 
# 5: 0.2875775 1 0.2875775 
# 6: 0.5281055 5 0.7883051 
# 7: 0.8924190 4 0.7883051 
# 8: 0.8830174 3 0.7883051 
# 9: 0.4089769 2 0.7883051 
#10: 0.7883051 1 0.7883051
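
Another way to think about the task is as "next observation carried backward": keep return only on the day == 1 rows and fill everything else from the next non-missing value below. A minimal sketch, assuming a data.table version that provides fifelse() and nafill() (1.12.4 or later); the table name dt is just for illustration and the data are typed in from the output above.

```r
library(data.table)

# Data typed in from the printed output, so the example does not depend on the RNG
dt <- data.table(
  return = c(0.4566147, 0.9404673, 0.0455565, 0.5514350, 0.2875775,
             0.5281055, 0.8924190, 0.8830174, 0.4089769, 0.7883051),
  day = rep(5:1, times = 2)
)

# Keep return only on the day == 1 rows, NA everywhere else
dt[, result := fifelse(day == 1, return, NA_real_)]

# Fill each NA with the next non-NA value below it (type = "nocb")
dt[, result := nafill(result, type = "nocb")]
dt[]
```

This avoids building a grouping variable, and rows after the last day == 1 row are simply left as NA.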