2016-04-13 30 views
-1

Subset an R data.table using another data.table

I have somewhat the same problem as "Subsetting a data.table using another data.table" and "Subset a data.table by matching columns of another data.table".

dt is the same:

dt 

    id year event 
1: 2 2005  1 
2: 2 2006  1 
3: 2 2007  1 
4: 4 2008  1 
5: 4 2009  1 
6: 2 2005  0 
7: 4 2006  0 
8: 4 2007  0 
9: 2 2008  0 

library(data.table)

dt <- data.table(id = c(2,2,2,4,4,2,4,4,2), year = c(2005:2009,2005:2008),
       event = rep(1:0, times=c(5, 4)))

However, dt1 is a bit different:

dt1 

    year performance event 
1: 2005  1000  1 
2: 2006  1001  1 
3: 2007  1002  1 
4: 2008  1003  1 
5: 2009  1004  1 
6: 2005  1005  0 
7: 2006  1006  0 
8: 2007  1007  0 
9: 2008  1008  0 

dt1 <- data.table(year = c(2005:2009,2005:2008), performance = 1000:1008, 
        event = rep(1:0, times=c(5, 4))) 

I would like to split dt1 based on the id in dt, with rows grouped by event. The desired output would be two different data.tables:

dt1.sub1 
    year performance event 
1: 2005  1000  1 
2: 2006  1001  1 
3: 2007  1002  1 
4: 2005  1005  0 
5: 2008  1008  0 


dt1.sub2 
    year performance event 
1: 2008  1003  1 
2: 2009  1004  1 
3: 2006  1006  0 
4: 2007  1007  0 

Is there a way to achieve this without using merge?

+0

I made a mistake. Everything in 'dt' is the same as in 'dt1', except that 'dt' has an extra 'id' column. I want to split 'dt1' according to the 'id' of 'dt'. –

+0

No, I don't think so. –

+2

You should edit/clean up your question. It is unclear what you are asking. – jangorecki

Answers

2

We can use split to create a list of 'data.tables':

lst <- split(dt1, dt$id) 
names(lst) <- paste0('dt1.sub', seq_along(lst)) 
lst 
#$dt1.sub1 
# year performance event 
#1: 2005  1000  1 
#2: 2006  1001  1 
#3: 2007  1002  1 
#4: 2005  1005  0 
#5: 2008  1008  0 

#$dt1.sub2 
# year performance event 
#1: 2008  1003  1 
#2: 2009  1004  1 
#3: 2006  1006  0 
#4: 2007  1007  0 
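
Note that split(dt1, dt$id) pairs each row of dt1 with the id in the same position of dt, so it relies on the two tables being in the same row order. A minimal sketch of a sanity check before splitting, assuming the example data from the question (the stopifnot guard is an addition for illustration, not part of the original answer):

```r
library(data.table)

# Example data from the question
dt <- data.table(id = c(2, 2, 2, 4, 4, 2, 4, 4, 2),
                 year = c(2005:2009, 2005:2008),
                 event = rep(1:0, times = c(5, 4)))
dt1 <- data.table(year = c(2005:2009, 2005:2008),
                  performance = 1000:1008,
                  event = rep(1:0, times = c(5, 4)))

# The split is positional, so confirm the shared columns line up row by row
stopifnot(nrow(dt) == nrow(dt1),
          identical(dt$year, dt1$year),
          identical(dt$event, dt1$event))

# Safe to split now: one data.table per distinct id
lst <- split(dt1, dt$id)
```

If the check fails, the positional split would silently assign rows to the wrong id, and a join on the shared columns is the safer route.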

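It is better to work within the list. However, if it is really needed, separate data.table objects can be created in the global environment with list2env: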

list2env(lst, envir = .GlobalEnv) 
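
To illustrate what working within the list can look like, here is a small sketch that applies a computation to every piece and then stacks the pieces back together. The names mean_perf, combined and the "subset" column are hypothetical, chosen only for this example:

```r
library(data.table)

# Example data from the question
dt <- data.table(id = c(2, 2, 2, 4, 4, 2, 4, 4, 2),
                 year = c(2005:2009, 2005:2008),
                 event = rep(1:0, times = c(5, 4)))
dt1 <- data.table(year = c(2005:2009, 2005:2008),
                  performance = 1000:1008,
                  event = rep(1:0, times = c(5, 4)))

lst <- split(dt1, dt$id)
names(lst) <- paste0("dt1.sub", seq_along(lst))

# Apply the same computation to each piece without creating global objects
mean_perf <- sapply(lst, function(d) mean(d$performance))

# Stack the pieces back into one table; the list names go into a 'subset' column
combined <- rbindlist(lst, idcol = "subset")
```

Keeping the pieces in one list means any later step can be written once with lapply or sapply instead of being repeated for each dt1.subN object.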
2
dt[dt1, on = c('year', 'event')][, .(list(.SD)), by = id]$V1 
#[[1]] 
# year event performance 
#1: 2005  1  1000 
#2: 2006  1  1001 
#3: 2007  1  1002 
#4: 2005  0  1005 
#5: 2008  0  1008 
# 
#[[2]] 
# year event performance 
#1: 2008  1  1003 
#2: 2009  1  1004 
#3: 2006  0  1006 
#4: 2007  0  1007
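
The join above matches rows on year and event rather than on row position, so it does not depend on dt and dt1 being in the same order. A possible variant, sketched under the assumption that data.table 1.9.8 or later is installed (the `by` argument of split for data.tables was added in that release), returns a list named by id:

```r
library(data.table)

# Example data from the question
dt <- data.table(id = c(2, 2, 2, 4, 4, 2, 4, 4, 2),
                 year = c(2005:2009, 2005:2008),
                 event = rep(1:0, times = c(5, 4)))
dt1 <- data.table(year = c(2005:2009, 2005:2008),
                  performance = 1000:1008,
                  event = rep(1:0, times = c(5, 4)))

# Attach id to every dt1 row by matching on the shared columns
joined <- dt[dt1, on = c("year", "event")]

# Split on id and drop the id column from each piece;
# the list elements are named after the id values ("2" and "4")
lst2 <- split(joined, by = "id", keep.by = FALSE)
```

This is still a join under the hood, so it may not satisfy the "without merge" constraint in the question, but it avoids relying on row order.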