How to select rows by group based on a condition on another column in data.table

library(data.table)  
dt <- data.table(Year=c(rep(2014,8), 2015, 2014, 2014), no=c(111,111,111,222,222,333,333,444,555,666,666), type=c('a','b','c','a','a','a','f','a', 'a', 'c','f'))

which returns:

Year no type 
1: 2014 111 a 
2: 2014 111 b 
3: 2014 111 c 
4: 2014 222 a 
5: 2014 222 a 
6: 2014 333 a 
7: 2014 333 f 
8: 2014 444 a 
9: 2015 555 a 
10: 2014 666 c 
11: 2014 666 f

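I want to filter out any no that does not contain both 'a' and at least one other type ('b', 'c', etc.). This means that no 222, 444 and 666 will be filtered out. Note that no 555 is also filtered out, because its Year is 2015.

My expected result is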

Year no type 
1: 2014 111 a 
2: 2014 111 b 
3: 2014 111 c 
4: 2014 333 a 
5: 2014 333 f

Then, using unique, we finally get no 111 and 333 as the final result.

I have tried the following:

setkey(dt, Year) 
dt1 <- dt[J(2014)][,.(type=unique(type)), by = no] 
unique(na.omit(merge(dt1[type=='a'],dt1[type!='a'], by = 'no', all = T))[,no])

However, I feel this code is inefficient. Could you give me some suggestions?

Source

2016-01-14 newbie

How about:

dt[Year == 2014, if("a" %in% type & uniqueN(type) > 1) .SD, by = no] 
# no Year type 
#1: 111 2014 a 
#2: 111 2014 b 
#3: 111 2014 c 
#4: 333 2014 a 
#5: 333 2014 f

Or, since you are only interested in the unique no values:

dt[Year == 2014, "a" %in% type & uniqueN(type) > 1, by = no][(V1), no] 
#[1] 111 333
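
To see how the chained form works, it may help to look at the intermediate result. The following is only a sketch on the question's data; the names flags and keep are introduced here for illustration and are not part of the answer.

```r
library(data.table)

dt <- data.table(
  Year = c(rep(2014, 8), 2015, 2014, 2014),
  no   = c(111, 111, 111, 222, 222, 333, 333, 444, 555, 666, 666),
  type = c('a', 'b', 'c', 'a', 'a', 'a', 'f', 'a', 'a', 'c', 'f')
)

# Step 1: one logical value per group. An unnamed j expression
# gets the default column name V1.
flags <- dt[Year == 2014, "a" %in% type & uniqueN(type) > 1, by = no]
# V1 is TRUE for 111 and 333, FALSE for 222, 444 and 666;
# 555 does not appear because it has no 2014 rows.

# Step 2: keep the groups where V1 is TRUE and return no as a vector.
keep <- flags[(V1), no]

# A more explicit spelling of the same filter.
keep_explicit <- flags[V1 == TRUE, no]
```

Splitting the expression this way makes no difference to the result; it only makes the per-group flag visible before it is used for subsetting.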

In case there may be NA values in your type column that you do not want to count as another value, you can modify it to:

dt[Year == 2014, "a" %in% type & uniqueN(na.omit(type)) > 1, by = no][(V1), no] 
#[1] 111 333
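
The question's data has no missing values, so both versions give the same output there. A small sketch with made-up data (the table dt_na and the group 777 are hypothetical) shows where they differ:

```r
library(data.table)

# Hypothetical data: group 777 has 'a' plus a missing type.
dt_na <- data.table(
  Year = 2014,
  no   = c(111, 111, 777, 777),
  type = c('a', 'b', 'a', NA)
)

# uniqueN() counts NA as a distinct value by default,
# so group 777 passes the "more than one type" test.
with_na <- dt_na[Year == 2014, "a" %in% type & uniqueN(type) > 1, by = no][(V1), no]

# Dropping NA first leaves only 'a' in group 777, so it is excluded.
without_na <- dt_na[Year == 2014, "a" %in% type & uniqueN(na.omit(type)) > 1, by = no][(V1), no]
```

Here with_na contains both 111 and 777, while without_na contains only 111.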

Source

2016-01-14 09:58:02

What does the '.SD' after 'if' do? – newbie

'.SD' is the subset of the data.table (for each group) –
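
To illustrate the comment above: an if without else returns NULL when its condition is FALSE, and data.table drops groups whose j expression evaluates to NULL. A minimal sketch on the question's data, where the names sizes and res are chosen here for illustration:

```r
library(data.table)

dt <- data.table(
  Year = c(rep(2014, 8), 2015, 2014, 2014),
  no   = c(111, 111, 111, 222, 222, 333, 333, 444, 555, 666, 666),
  type = c('a', 'b', 'c', 'a', 'a', 'a', 'f', 'a', 'a', 'c', 'f')
)

# .SD is one sub-table per group, holding that group's rows
# (all columns except the grouping column no).
sizes <- dt[Year == 2014, .(rows_in_SD = nrow(.SD)), by = no]

# Groups that pass the test contribute their .SD to the result;
# groups that fail yield NULL and are dropped.
res <- dt[Year == 2014, if ("a" %in% type && uniqueN(type) > 1) .SD, by = no]
```

The sketch uses && inside if because the condition is a single TRUE/FALSE per group; the answer's & behaves the same way here.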

We can also use any

res <- dt[Year==2014, if(any(type=="a") & any(type!="a")) .SD, no] 
res 
# no Year type 
#1: 111 2014 a 
#2: 111 2014 b 
#3: 111 2014 c 
#4: 333 2014 a 
#5: 333 2014 f 

unique(res$no) 
#[1] 111 333

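The same approach can be used with dplyr

library(dplyr)
dt %>%
    # restrict to 2014 before grouping, so any() only sees 2014 rows
    filter(Year == 2014) %>%
    group_by(no) %>%
    filter(any(type=="a") & any(type!="a"))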

Source

2016-01-14 10:42:55 akrun
