2017-04-18 27 views
0

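I have been trying to repeat a binary output of 1 and 2 within groups. I would like to use rep() with dplyr, but I can't work out how to apply rep() within groups. So far I have only managed it by splitting the groups apart by hand and specifying the correct range for each one. How can I apply rep() within groups using dplyr?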

Here is some sample data.

df <- data.frame(date = c("2017-01-01", "2017-01-01", "2017-01-01", "2017-01-01", "2017-01-01", "2017-01-01", "2017-01-01", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02"), 
       loc =c("AB", "AB", "AB", "AB", "AB", "AB", "AB", "AB", "CD", "CD", "CD", "CD", "CD", "CD", "CD", "CD", "CD", "CD"), 
       cat = c("a", "a", "a", "b", "b", "b", "b", "b", "c", "c", "c", "c", "c", "d", "d", "d", "d", "d")) 

This is essentially the code I run on each group, applied here to the whole dataset.

df$type <- rep(1:2,nrow(df)/2) 

As you can see, the output ignores the column cat: the first rows of cat b and d should start at 1.

  date loc cat type 
1 2017-01-01 AB a 1 
2 2017-01-01 AB a 2 
3 2017-01-01 AB a 1 
4 2017-01-01 AB b 2 
5 2017-01-01 AB b 1 
6 2017-01-01 AB b 2 
7 2017-01-01 AB b 1 
8 2017-01-02 AB b 2 
9 2017-01-02 CD c 1 
10 2017-01-02 CD c 2 
11 2017-01-02 CD c 1 
12 2017-01-02 CD c 2 
13 2017-01-02 CD c 1 
14 2017-01-02 CD d 2 
15 2017-01-02 CD d 1 
16 2017-01-02 CD d 2 
17 2017-01-02 CD d 1 

Update: below is the desired output.

 date loc cat type 
1 2017-01-01 AB a 1 
2 2017-01-01 AB a 2 
3 2017-01-01 AB a 1 
4 2017-01-01 AB b 1 
5 2017-01-01 AB b 2 
6 2017-01-01 AB b 1 
7 2017-01-01 AB b 2 
8 2017-01-02 AB b 1 
9 2017-01-02 CD c 1 
10 2017-01-02 CD c 2 
11 2017-01-02 CD c 1 
12 2017-01-02 CD c 2 
13 2017-01-02 CD c 1 
14 2017-01-02 CD d 1 
15 2017-01-02 CD d 2 
16 2017-01-02 CD d 1 
17 2017-01-02 CD d 2 
+1

In base R, 'df$type <- ave(seq(nrow(df)), df$cat, FUN = function(x){rep(1:2, length.out = length(x))})', or, if you sort by 'cat' first, 'unlist(lapply(table(df$cat), function(x){rep(1:2, length.out = x)}))' – alistaire
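For anyone who wants to try the base R suggestion in isolation, here is a minimal sketch. The data frame `toy` is a made-up example, not the data from the question, and `seq_len()` is used in place of `seq()` purely as a matter of habit.

```r
# Hypothetical toy data: three rows of "a", five rows of "b".
toy <- data.frame(cat = c("a", "a", "a", "b", "b", "b", "b", "b"))

# ave() splits the row indices by toy$cat, applies FUN to each piece,
# and puts the results back in the original row order.
toy$type <- ave(seq_len(nrow(toy)), toy$cat,
                FUN = function(x) rep(1:2, length.out = length(x)))

print(toy)
```

Because ave() restores the original row order, this should work even when the rows of one group are not contiguous.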

Answers

1

Assuming cat is the only relevant grouping variable here (and not date and loc), you can do:

library(dplyr) 
df = df %>% 
    group_by(cat) %>% 
    mutate(type = rep(1:2, length.out = length(cat))) 
# Output: 
     date loc cat type 
     <fctr> <fctr> <fctr> <int> 
1 2017-01-01  AB  a  1 
2 2017-01-01  AB  a  2 
3 2017-01-01  AB  a  1 
4 2017-01-01  AB  b  1 
5 2017-01-01  AB  b  2 
6 2017-01-01  AB  b  1 
7 2017-01-01  AB  b  2 
8 2017-01-02  AB  b  1 
9 2017-01-02  CD  c  1 
10 2017-01-02  CD  c  2 
11 2017-01-02  CD  c  1 
12 2017-01-02  CD  c  2 
13 2017-01-02  CD  c  1 
14 2017-01-02  CD  d  1 
15 2017-01-02  CD  d  2 
16 2017-01-02  CD  d  1 
17 2017-01-02  CD  d  2 
18 2017-01-02  CD  d  1 
+0

Thanks @Marius, that solved the problem. – JnrfL

+2

You could use 'length.out = n()' – alistaire
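A minimal sketch of that n() variant, for reference. The data frame `toy` is a hypothetical example rather than the question's data, and the trailing ungroup() is an optional addition so that later operations are not silently grouped.

```r
library(dplyr)

# Hypothetical toy data: three rows of "a", five rows of "b".
toy <- data.frame(cat = c("a", "a", "a", "b", "b", "b", "b", "b"))

toy <- toy %>%
  group_by(cat) %>%
  # n() is the number of rows in the current group, so the 1, 2 pattern
  # restarts at 1 for every value of cat.
  mutate(type = rep(1:2, length.out = n())) %>%
  ungroup()

print(toy)
```

If date and loc should also define the groups, they can be listed alongside cat in group_by().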