2016-11-02

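Label a column in a data frame when a row occurs

I have a data set containing orders from several customers. The orders are sorted by ascending order date. Here are two of those customers: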

data <- data.frame(
    ID_CLIENTE = c(rep(1, 8), rep(2, 8)),
    PEDIDO = c("A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8"),
    LABEL = c(NA, NA, "1ER_PEDIDO", NA, NA, NA, NA, NA, NA, NA, NA, NA, "1ER_PEDIDO", NA, NA, NA),
    DATE = as.Date(c("2014-09-22", "2014-12-16", "2015-01-19", "2015-03-11", "2015-05-18", "2015-10-28", "2016-04-13", "2016-06-09",
                     "2014-10-08", "2014-10-12", "2014-10-26", "2014-11-06", "2014-11-24", "2014-12-10", "2014-12-11", "2015-01-12")),
    stringsAsFactors = FALSE  # keep PEDIDO and LABEL as character on R < 4.0
)

    > data
       ID_CLIENTE PEDIDO      LABEL       DATE
    1           1     A1       <NA> 2014-09-22
    2           1     A2       <NA> 2014-12-16
    3           1     A3 1ER_PEDIDO 2015-01-19
    4           1     A4       <NA> 2015-03-11
    5           1     A5       <NA> 2015-05-18
    6           1     A6       <NA> 2015-10-28
    7           1     A7       <NA> 2016-04-13
    8           1     A8       <NA> 2016-06-09
    9           2     B1       <NA> 2014-10-08
    10          2     B2       <NA> 2014-10-12
    11          2     B3       <NA> 2014-10-26
    12          2     B4       <NA> 2014-11-06
    13          2     B5 1ER_PEDIDO 2014-11-24
    14          2     B6       <NA> 2014-12-10
    15          2     B7       <NA> 2014-12-11
    16          2     B8       <NA> 2015-01-12

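For each customer, I want to label every order placed before or after the order marked "1ER_PEDIDO". The resulting data frame must look like this: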

   ID_CLIENTE PEDIDO      LABEL       DATE
1           1     A1     BEFORE 2014-09-22
2           1     A2     BEFORE 2014-12-16
3           1     A3 1ER_PEDIDO 2015-01-19
4           1     A4      AFTER 2015-03-11
5           1     A5      AFTER 2015-05-18
6           1     A6      AFTER 2015-10-28
7           1     A7      AFTER 2016-04-13
8           1     A8      AFTER 2016-06-09
9           2     B1     BEFORE 2014-10-08
10          2     B2     BEFORE 2014-10-12
11          2     B3     BEFORE 2014-10-26
12          2     B4     BEFORE 2014-11-06
13          2     B5 1ER_PEDIDO 2014-11-24
14          2     B6      AFTER 2014-12-10
15          2     B7      AFTER 2014-12-11
16          2     B8      AFTER 2015-01-12

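Should I use a data.table function? The labelling has to be done per customer: I need to fix one customer, go through all the orders that customer made, and then label them.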

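Is there always exactly one "1ER_PEDIDO" order per customer?

Yes, @docendodiscimus. Each customer can place several orders, but only one of them is labelled "1ER_PEDIDO". In this example, customer 1 placed 8 orders and customer 2 placed 8 orders.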

Answers

Here is a two-step data.table approach:

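library(data.table) 
setDT(data)  # convert the data frame to a data.table by reference

# Step 1: per customer, flag the rows dated before the marked order.
# The grouped result ($V1) lines up with the rows of data only because
# each customer's rows are contiguous.
data[data[, DATE < DATE[LABEL == "1ER_PEDIDO" & !is.na(LABEL)], by = ID_CLIENTE]$V1, 
    LABEL := "BEFORE"] 

# Step 2: the same comparison for the rows dated after the marked order.
data[data[, DATE > DATE[LABEL == "1ER_PEDIDO" & !is.na(LABEL)], by = ID_CLIENTE]$V1, 
    LABEL := "AFTER"] 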

It works! Thank you very much!

Can you make the data.table approach not depend on DATE? I think that is why it is twice as long as the dplyr solution. – Irakli

Here is a dplyr solution, based on the answers by @nicola and @akrun to my question:

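library(tidyverse) 
data %>% 
    group_by(ID_CLIENTE) %>% 
    # sign() of (row position - position of the marked order) is -1, 0 or 1;
    # adding 2 turns it into an index into the three labels
    mutate(LABEL = c('BEFORE', '1ER_PEDIDO', 'AFTER')
           [sign(seq_along(LABEL) - match('1ER_PEDIDO', LABEL)) + 2]) %>% 
    ungroup() 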
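
Regarding the comment asking for a data.table version that does not depend on DATE: the positional trick from this dplyr answer can probably be carried over as is. The following is only a sketch under stated assumptions, not something posted in the thread. It assumes each customer's rows are already in chronological order and that every customer has exactly one "1ER_PEDIDO" row; the small `orders` table is a made-up stand-in for the question's data.

```r
library(data.table)

# Hypothetical stand-in for the question's data: two customers,
# each with exactly one order marked "1ER_PEDIDO".
orders <- data.table(
  ID_CLIENTE = c(1, 1, 1, 1, 2, 2, 2),
  PEDIDO     = c("A1", "A2", "A3", "A4", "B1", "B2", "B3"),
  LABEL      = c(NA, "1ER_PEDIDO", NA, NA, NA, NA, "1ER_PEDIDO")
)

# Within each customer, compare every row position with the position of the
# marked order. sign() gives -1, 0 or 1, and adding 2 turns that into an
# index into the three labels. DATE is never consulted.
orders[, LABEL := c("BEFORE", "1ER_PEDIDO", "AFTER")[
  sign(seq_len(.N) - match("1ER_PEDIDO", LABEL)) + 2
], by = ID_CLIENTE]

print(orders)
```

If a customer had no marked order, `match()` would return `NA` and every label for that customer would become `NA`, so that case would need separate handling.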