o 'rbindlist' gains 'use.names' and 'fill' arguments and is now implemented
entirely in C. Closes #5249
-> use.names by default is FALSE for backwards compatibility (doesn't bind by
names by default)
-> rbind(...) now just calls rbindlist() internally, except that 'use.names'
is TRUE by default, for compatibility with base (and backwards compatibility).
-> fill by default is FALSE. If fill is TRUE, use.names has to be TRUE.
-> At least one item of the input list has to have non-null column names.
-> Duplicate columns are bound in the order of occurrence, like base.
-> Attributes that might exist in individual items would be lost in the bound result.
-> Columns are coerced to the highest SEXPTYPE, if they are different, if/when possible.
-> And incredibly fast ;).
-> Documentation updated in much detail. Closes DR #5158.
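The coercion rule in the notes above can be seen in a minimal sketch, assuming a data.table version that includes this change. The names DT_int and DT_dbl are made up for illustration; an integer column bound to a double column comes back as double.

```r
library(data.table)

# Hypothetical inputs: the same column name with two different types
DT_int <- data.table(x = 1L)    # integer column
DT_dbl <- data.table(x = 2.5)   # double column

# Binding coerces the column to the higher of the two types (double)
res <- rbindlist(list(DT_int, DT_dbl))
class(res$x)
# [1] "numeric"
```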
Check this post for benchmarks.
Examples:
1) Using rbindlist's fill argument:
DT1 <- data.table(x=1, y=2)
DT2 <- data.table(y=2, z=-1)
rbindlist(list(DT1, DT2), fill=TRUE)
# x y z
# 1: 1 2 NA
# 2: NA 2 -1
Note that when fill=TRUE, use.names has to be TRUE.
2) Proper binding of tables with duplicate names:
DT1 <- data.table(x=1, x=2, y=1, y=2)
DT2 <- data.table(y=3, y=-1, y=-2)
rbindlist(list(DT1, DT2), fill=TRUE)
# x x y y y
# 1: 1 2 1 2 NA
# 2: NA NA 3 -1 -2
3) It is not limited to data.tables; it works on data.frames and lists as well:
DT1 <- data.table(x=1, y=2)
DT2 <- data.frame(y=2, z=-1)
DT3 <- list(z=10)
rbindlist(list(DT1,DT2,DT3), fill=TRUE)
# x y z
# 1: 1 2 NA
# 2: NA 2 -1
# 3: NA NA 10
4) If you only want to bind by name, you can set use.names=TRUE without fill:
DT1 <- data.table(x=1, y=2)
DT2 <- data.table(y=1, x=2)
rbindlist(list(DT1,DT2), use.names=TRUE, fill=FALSE)
# x y
# 1: 1 2
# 2: 2 1
DT1 <- data.table(x=1, y=2)
DT2 <- data.table(z=2, y=1)
# errors with fill=FALSE: the column names differ, so these can only be bound with fill=TRUE
rbindlist(list(DT1, DT2), use.names=TRUE, fill=FALSE)
# Error in rbindlist(list(DT1, DT2), use.names = TRUE, fill = FALSE) :
# Answer requires 3 columns whereas one or more item(s) in the input
# list has only 2 columns. ...
5) The defaults are unchanged for backwards compatibility (use.names=FALSE, fill=FALSE):
DT1 <- data.table(x=1, y=2)
DT2 <- data.table(y=1, x=2)
rbindlist(list(DT1, DT2))
# x y
# 1: 1 2
# 2: 1 2
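For contrast with example 5, here is a short sketch of the difference the NEWS entry describes between rbind() and rbindlist(), assuming a data.table version that includes this change. use.names=FALSE is passed explicitly so that the positional behaviour does not depend on the installed version's default.

```r
library(data.table)

DT1 <- data.table(x = 1, y = 2)
DT2 <- data.table(y = 1, x = 2)

# rbind() on data.tables matches columns by name (use.names = TRUE)
by_name <- rbind(DT1, DT2)
#    x y
# 1: 1 2
# 2: 2 1

# rbindlist() with use.names = FALSE binds by position and ignores names
by_pos <- rbindlist(list(DT1, DT2), use.names = FALSE)
#    x y
# 1: 1 2
# 2: 1 2
```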
HTH
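Check out 'rbind.fill {plyr}' – TheComeOnMan
In this case, you could use 'merge(DT1, DT2, by="A", all=TRUE)'. But this only gives the same result as 'rbind' because A is unique. Otherwise, if you add a unique id to each 'data.table', you can still use 'merge'. – shadow
The reason I posted this question is that I read that rbindlist is much faster than rbind. But maybe rbind.fill is still the best approach. Also, wouldn't merge be very inefficient, since it does a lot of checks? –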