I have a data frame that looks like this:
df_raw <- structure(list(date = structure(c(17075, 17076, 17077, 17108,
17109, 17110, 17111, 17112, 17113, 17221, 17222, 17223, 17224,
17225, 17226, 17227, 17228, 17229, 17230, 17231, 17232, 17286,
17075, 17076, 17077, 17078, 17079, 17080, 17081, 17082, 17083,
17084, 17085, 17086, 17087, 17088, 17089, 17090, 17091), class = "Date"),
Req_BU = c("12018", "12018", "12018", "12018", "12018", "12018",
"12018", "12018", "12018", "12018", "12018", "12018", "12018",
"12018", "12018", "12018", "12018", "12018", "12018", "12018",
"12018", "12018", "14004", "14004", "14004", "14004", "14004",
"14004", "14004", "14004", "14004", "14004", "14004", "14004",
"14004", "14004", "14004", "14004", "14004"), last_rec_date = c(1L,
1L, 1L, 1L, 1L, NA, NA, 3L, 1L, 1L, 1L, NA, 2L, 1L, 1L, 1L,
1L, 1L, NA, NA, 3L, 1L, NA, NA, 1L, 1L, 1L, 1L, 1L, NA, NA,
3L, 1L, 1L, 1L, 1L, NA, 2L, 1L)), .Names = c("date", "Req_BU",
"last_rec_date"), row.names = c(NA, -39L), class = "data.frame")
> head(df_raw, 10)
date Req_BU last_rec_date
1 2016-10-01 12018 1
2 2016-10-02 12018 1
3 2016-10-03 12018 1
4 2016-11-03 12018 1
5 2016-11-04 12018 1
6 2016-11-05 12018 NA
7 2016-11-06 12018 NA
8 2016-11-07 12018 3
9 2016-11-08 12018 1
10 2017-02-24 12018 1
> df_raw[22:30, ]
date Req_BU last_rec_date
22 2017-04-30 12018 1
23 2016-10-01 14004 NA
24 2016-10-02 14004 NA
25 2016-10-03 14004 1
26 2016-10-04 14004 1
27 2016-10-05 14004 1
28 2016-10-06 14004 1
29 2016-10-07 14004 1
30 2016-10-08 14004 NA
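What I need to do is replace each NA value in the last_rec_date column with the number of days since the last non-NA value. All of this needs to be done per group, using the grouping variable Req_BU. My data starts on 2016-10-01, and if a particular Req_BU starts with NA on that date, the value needs to be filled with 1, and this should continue until there is a non-NA value, at which point the normal logic takes over.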
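The rule described above could be sketched in base R roughly as follows. This is only an illustrative sketch, not a tested solution for the full data: the helper name days_since_last_obs, the toy data frame and the filled column are made up for the example, and it assumes the rows are already sorted by date within each Req_BU.

```r
# Hypothetical helper (not from any package): for each NA in `x`, return the
# number of days since the most recent non-NA row; leading NAs become 1.
# Assumes `date` is a Date vector sorted in ascending order.
days_since_last_obs <- function(date, x) {
  obs <- !is.na(x)
  # Index of the most recent non-NA row at or before each position (0 if none yet)
  last_idx <- cummax(ifelse(obs, seq_along(x), 0L))
  # Date of that row; NA where no observation has been seen yet
  last_date <- date[ifelse(last_idx == 0L, NA_integer_, last_idx)]
  gap <- as.integer(date - last_date)
  # Keep observed values, use the day gap for NAs
  out <- ifelse(obs, x, gap)
  # Leading NAs (nothing observed yet in this group) are filled with 1
  out[last_idx == 0L] <- 1L
  out
}

# Toy data with two groups; group "B" starts with NA values
toy <- data.frame(
  date = as.Date("2016-10-01") + c(0:3, 0:3),
  Req_BU = rep(c("A", "B"), each = 4),
  last_rec_date = c(1L, NA, NA, 3L, NA, NA, 1L, NA)
)

# Apply the helper separately to each group
toy$filled <- NA_integer_
for (bu in unique(toy$Req_BU)) {
  rows <- toy$Req_BU == bu
  toy$filled[rows] <- days_since_last_obs(toy$date[rows], toy$last_rec_date[rows])
}
print(toy)
```

In a dplyr pipeline the same helper could presumably be called inside a grouped mutate, for example df_raw %>% group_by(Req_BU) %>% mutate(last_rec_date = days_since_last_obs(date, last_rec_date)).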
I am looking for something like this:
> head(df_hope, 10)
date Req_BU last_rec_date
1 2016-10-01 12018 1
2 2016-10-02 12018 1
3 2016-10-03 12018 1
4 2016-11-03 12018 1
5 2016-11-04 12018 1
6 2016-11-05 12018 1
7 2016-11-06 12018 2
8 2016-11-07 12018 3
9 2016-11-08 12018 1
10 2017-02-24 12018 1
> df_hope[22:30, ]
date Req_BU last_rec_date
22 2017-04-30 12018 1
23 2016-10-01 14004 1
24 2016-10-02 14004 1
25 2016-10-03 14004 1
26 2016-10-04 14004 1
27 2016-10-05 14004 1
28 2016-10-06 14004 1
29 2016-10-07 14004 1
30 2016-10-08 14004 1
I tried the following, but it does not even handle the first part of the logic I need.
library(dplyr)

df_not_working <- df_raw %>%
  group_by(Req_BU) %>%
  mutate(last_rec_date = ifelse(is.na(last_rec_date),
                                c(NA, diff(date)),
                                last_rec_date))
> df_not_working
Source: local data frame [39 x 3]
Groups: Req_BU [2]
# A tibble: 39 x 3
date Req_BU last_rec_date
<date> <chr> <dbl>
1 2016-10-01 12018 1
2 2016-10-02 12018 1
3 2016-10-03 12018 1
4 2016-11-03 12018 1
5 2016-11-04 12018 1
6 2016-11-05 12018 1
7 2016-11-06 12018 1
8 2016-11-07 12018 3
9 2016-11-08 12018 1
10 2017-02-24 12018 1
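A small illustration of why rows 6 and 7 both come out as 1 in the output above: diff(date) measures the gap to the previous row, not to the last non-NA row, so consecutive NA rows on consecutive days all get 1. The vectors below are a made-up four-row extract mirroring rows 5 to 8, and the variable names are only for the example.

```r
# Four consecutive days; the middle two values are missing
date <- as.Date(c("2016-11-04", "2016-11-05", "2016-11-06", "2016-11-07"))
x <- c(1L, NA, NA, 3L)

# Gap to the previous row, which is what c(NA, diff(date)) computes
gap_prev_row <- c(NA, as.integer(diff(date)))

# Same replacement rule as in the attempt above
attempt <- ifelse(is.na(x), gap_prev_row, x)
print(attempt)  # the second NA gets 1, whereas the hoped-for value is 2
```

The c(NA, ...) padding also means that a group whose first row is NA keeps NA there, rather than the 1 required by the start-of-data rule.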
The rest of the analysis is fairly dplyr-heavy, so I am fine with either a dplyr or a base R solution (if one exists). Thanks.