2014-12-03 28 views
1

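R data.table: selecting the minimum chron value within a group

For each group in a data.table, I want to repeat the value of the minimum (earliest) time stamp. See the data below: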

library(chron) 
library(data.table) 
set.seed(12349870) 
time.stamp<-chron(c(10000.673,sample(10001:20000,9))) 
group<-c(rep(1,5),rep(2,5)) 
timedata<-data.table(time.stamp=time.stamp,group=group) 
timedata 

# 1: (05/19/97 16:09:07)  1 
# 2: (03/02/21 00:00:00)  1 
# 3: (02/20/15 00:00:00)  1 
# 4: (12/11/10 00:00:00)  1 
# 5: (08/23/10 00:00:00)  1 
# 6: (07/22/18 00:00:00)  2 
# 7: (06/09/23 00:00:00)  2 
# 8: (03/02/13 00:00:00)  2 
# 9: (06/04/09 00:00:00)  2 
# 10: (12/04/12 00:00:00)  2 

The following runs, but when I try to view the data.table, I get an error:

timedata[,firstdata:=time.stamp[which.min(time.stamp)],by=group] 
timedata 
#Error in format.dates(x, format[[1]], origin. = origin., simplify = simplify) : 
#unknown date format 

Session info: R version 3.1.1, chron_2.3-45, data.table_1.9.2
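Since the error only appears when the table is printed, one way to investigate is to compare the attributes of the new column with those of the source column without printing the table. This is a diagnostic sketch, not a confirmed explanation of the error; the variable names `orig_attr` and `new_attr` are made up for illustration.

```r
library(chron)
library(data.table)

set.seed(12349870)
timedata <- data.table(
  time.stamp = chron(c(10000.673, sample(10001:20000, 9))),
  group = c(rep(1, 5), rep(2, 5))
)
timedata[, firstdata := time.stamp[which.min(time.stamp)], by = group]

# Compare the chron attributes (class, format, origin) of the two columns
# without printing the table itself.
orig_attr <- attributes(timedata$time.stamp)
new_attr <- attributes(timedata$firstdata)
str(orig_attr)
str(new_attr)

# Attribute names present on the source column but missing on the new one
setdiff(names(orig_attr), names(new_attr))
```

If the new column has lost an attribute that the source column carries, that would be consistent with a formatting failure at print time.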

Works for me (on 1.9.5, the devel version). Maybe try updating to 1.9.4, the current CRAN version? – Arun 2014-12-03 19:14:30

@Arun Updated to data.table_1.9.4, and now my assignment by reference works. Thanks. – brorgschlr 2014-12-03 19:20:02

Great! Make sure you [read about the bug in auto indexing in 1.9.4](http://stackoverflow.com/questions/26308072/operator-inconsistent-in-logical-columns-in-data-table), which is fixed in 1.9.5. You can turn the feature off by following [this comment](http://stackoverflow.com/questions/26308072/operator-inconsistent-in-logical-columns-in-data-table#comment41286824_26308820). – Arun 2014-12-03 19:24:00
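The advice in these comments can be sketched as a quick version check followed by switching auto indexing off. The option name `datatable.auto.index` is an assumption taken to be what the linked comment refers to; the variable `installed` is only illustrative.

```r
library(data.table)

# Report the installed version; the comments suggest 1.9.4 or later.
installed <- packageVersion("data.table")
print(installed)

# Assumption: auto indexing is controlled by this option, as described
# in the linked comment. Setting it to FALSE turns the feature off.
options(datatable.auto.index = FALSE)
getOption("datatable.auto.index")
```

Running `install.packages("data.table")` would fetch the current CRAN release if the reported version is older than expected.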

Answers

0

Do you mean something like this?

stopifnot(packageVersion("data.table")>="1.9.4") # requires data.table 1.9.4 or later 
timedata[,firstdata:=time.stamp[which.min(time.stamp)],by=group] 
timedata 
#   time.stamp group   firstdata 
#1: (05/19/97 16:09:07)  1 (05/19/97 16:09:07) 
#2: (03/02/21 00:00:00)  1 (05/19/97 16:09:07) 
#3: (02/20/15 00:00:00)  1 (05/19/97 16:09:07) 
#4: (12/11/10 00:00:00)  1 (05/19/97 16:09:07) 
#5: (08/23/10 00:00:00)  1 (05/19/97 16:09:07) 
#6: (07/22/18 00:00:00)  2 (06/04/09 00:00:00) 
#7: (06/09/23 00:00:00)  2 (06/04/09 00:00:00) 
#8: (03/02/13 00:00:00)  2 (06/04/09 00:00:00) 
#9: (06/04/09 00:00:00)  2 (06/04/09 00:00:00) 
#10:(12/04/12 00:00:00)  2 (06/04/09 00:00:00) 

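Sorry, my original MWE named the time column "date", but I changed it to "time.stamp" to avoid confusion with the date function. Did you actually run the above? I get the same error. – brorgschlr 2014-12-03 18:07:53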

@Arun pointed out that I was using an old version of data.table. After updating to 1.9.4, this works. – brorgschlr 2014-12-03 19:26:08

0

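The following does what I want, although I would have preferred the assignment by reference attempted in my question (here, for example, I have to rename a column).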

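# Sort by group, then by time stamp within each group 
setkey(timedata,group,time.stamp) 
# Join every row to the first (earliest) row of its group; the joined-in 
# time stamp shows up as the extra column time.stamp.1 
timedata<-timedata[timedata[,.SD[1],keyby=group]] 

# Rename a single column by reference, failing early if it does not exist 
changename<-function(dt,oldname,newname){ 
    stopifnot(oldname %in% names(dt)) 
    setnames(dt,oldname,newname) 
} 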

changename(timedata,"time.stamp.1","firstdata") 
timedata 

# group   time.stamp   firstdata 
#1:  1 (05/19/97 16:09:07) (05/19/97 16:09:07) 
#2:  1 (08/23/10 00:00:00) (05/19/97 16:09:07) 
#3:  1 (12/11/10 00:00:00) (05/19/97 16:09:07) 
#4:  1 (02/20/15 00:00:00) (05/19/97 16:09:07) 
#5:  1 (03/02/21 00:00:00) (05/19/97 16:09:07) 
#6:  2 (06/04/09 00:00:00) (06/04/09 00:00:00) 
#7:  2 (12/04/12 00:00:00) (06/04/09 00:00:00) 
#8:  2 (03/02/13 00:00:00) (06/04/09 00:00:00) 
#9:  2 (07/22/18 00:00:00) (06/04/09 00:00:00) 
#10: 2 (06/09/23 00:00:00) (06/04/09 00:00:00)
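The join above can also be written so that the new column is added by reference and no rename is needed. This is a minimal sketch, assuming a data.table version recent enough to support the `on=` argument and the `i.` prefix in update joins (newer than the versions discussed in this thread); the helper table `firsts` is a name made up for illustration.

```r
library(chron)
library(data.table)

set.seed(12349870)
timedata <- data.table(
  time.stamp = chron(c(10000.673, sample(10001:20000, 9))),
  group = c(rep(1, 5), rep(2, 5))
)

# One row per group holding the earliest time stamp: sort by time within
# group, then keep the first value of each group.
firsts <- timedata[order(group, time.stamp),
                   .(firstdata = time.stamp[1L]),
                   by = group]

# Update join: look up each row's group in firsts and add firstdata to
# timedata by reference, leaving the original row order untouched.
timedata[firsts, on = "group", firstdata := i.firstdata]
```

Unlike the keyed join, this leaves `timedata` in its original order and does not create a `time.stamp.1` column.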