Counting levels within a group_by hierarchy in dplyr

I have a large dataset in R that is organised as records of multiple cases nested within groups. A toy example is here:

d = data.frame(group = rep(c('control','patient'), each = 5), case = c('a', 'a', 'b', 'c', 'c', 'd','d','d','e','e'))

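If group_by(group, case) is applied in a dplyr chain, how can a column be created that numbers each row with the order of its case within its group? In the example below, the third column shows that case 'a' is the first case in the control group and case 'c' is the third, but for case 'd', the first case in the patient group, the numbering resets to 1.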

group case number 
control a 1 
control a 1 
control b 2 
control c 3 
control c 3 
patient d 1 
patient d 1 
patient d 1 
patient e 2 
patient e 2

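I can see how this could be done by counting the cases in a 'for' loop, but I am wondering whether there is a way to achieve it within a standard dplyr-style chain of operations?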

Source

2017-06-06 Michael MacAskill

d %>% group_by(group) %>% mutate(number = match(case, unique(case)))

@docendodiscimus Very elegant. If this were an answer, I would accept it...
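The match(case, unique(case)) idea from the comment above can be tried end to end. This is a minimal sketch on the toy data frame from the question; the name result and the trailing ungroup() are additions for illustration, not part of the original comment.

```r
library(dplyr)

# Toy data from the question
d <- data.frame(
  group = rep(c("control", "patient"), each = 5),
  case  = c("a", "a", "b", "c", "c", "d", "d", "d", "e", "e")
)

# Within each group, unique(case) lists the cases in order of first
# appearance, and match() returns each row's position in that list.
result <- d %>%
  group_by(group) %>%
  mutate(number = match(case, unique(case))) %>%
  ungroup()  # added here so later steps are not accidentally grouped

print(result)
```

Because the numbering comes from order of first appearance, it restarts at 1 for each group without any explicit loop.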

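library(dplyr)

# Assumes `case` is a factor (the data.frame() default before R 4.0.0);
# droplevels() removes levels unused in each group before converting to codes
group_by(d, group) %>% 
    mutate(number = droplevels(case) %>% as.numeric)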

Source

2017-06-06 05:29:15

Wow, that's nice.

We can use data.table

library(data.table) 
setDT(d)[, numbers := as.numeric(factor(case, levels = unique(case))), group]
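The levels = unique(case) argument in the call above is what makes the numbering follow order of first appearance rather than sorted order. A small base R sketch of the difference, using a made-up vector (not from the question) whose cases are deliberately not in alphabetical order:

```r
# Hypothetical vector, chosen so that appearance order differs from sort order
case <- c("c", "c", "a", "b", "a")

# Default factor levels are sorted, so the codes follow alphabetical order
by_level <- as.numeric(factor(case))

# Levels taken from unique() follow the order of first appearance
by_appearance <- as.numeric(factor(case, levels = unique(case)))

print(by_level)
print(by_appearance)
```

In the toy data from the question the cases already appear in alphabetical order, so both variants give the same numbers there.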

Source

2017-06-06 06:05:32 akrun

Thanks. Although I generally prefer to stick with (what I find to be) the more readable dplyr syntax, this is nice.

One solution would be:

library(dplyr) 
library(tibble) 

want<-left_join(d, 
       d %>% 
        distinct(case) %>% 
        rownames_to_column(var="number") , 
       by="case") 

# .. added later: 
want2<-left_join(d, 
       bind_rows(
        d %>% 
        filter(group=="control") %>% 
        distinct(case) %>% 
        rownames_to_column(var="number"), 
        d %>% 
        filter(group=="patient") %>% 
        distinct(case) %>% 
        rownames_to_column(var="number")), 
        by="case") 

# I think this is less readable: 
want3<-left_join(d, 
       bind_rows(by(d,d$group,function(x) x %>% 
           distinct(case) %>% 
           rownames_to_column(var="number"))), 
       by="case")

Source

2017-06-06 06:17:08

Nice, but although using rownames gets part of the way there (numbering the cases), the numbering doesn't restart at each level of the grouping variable.

Re-read that part..
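The point raised in the comments, that the numbering should restart within each group, can also be handled without filtering each group by hand. A minimal sketch, assuming the same toy data as the question; the names lookup and want4 are hypothetical and not part of the original answer:

```r
library(dplyr)

# Toy data from the question
d <- data.frame(
  group = rep(c("control", "patient"), each = 5),
  case  = c("a", "a", "b", "c", "c", "d", "d", "d", "e", "e")
)

# One row per (group, case) pair, numbered within each group
lookup <- d %>%
  distinct(group, case) %>%
  group_by(group) %>%
  mutate(number = row_number()) %>%
  ungroup()

# Attach the per-group number back onto every original row
want4 <- left_join(d, lookup, by = c("group", "case"))

print(want4)
```

Joining on both group and case also keeps the result correct if the same case label were ever to occur in more than one group, and number comes out as an integer rather than the character column that rownames_to_column() produces.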
