2013-04-18

Calculating time spent per category

I have a sequence of categories, as follows:

categoryVector <- c("1_100_1_2_3") 

I also have the time corresponding to each category:

timeVector <- c("2013-03-07 05:16:50,617_2013-03-07 05:19:24,984_2013-03-07 05:21:06,002_2013-03-07 05:21:06,833_2013-03-07 05:21:10,713") 

I want to calculate the time spent in categories 1 and 2:

Time spent in category 1: (time at 100 - time at first 1) + (time at 2 - time at second 1) 
Time spent in category 2: time at 3 - time at 2 

I need to repeat these calculations for 200K+ records. Is there an efficient way to do this in R?

For more information, see `?strsplit` and `?as.POSIXct`.
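A minimal sketch of what the comment hints at, using `strsplit` and `as.POSIXct` on the single record from the question. It assumes the comma in each timestamp is a decimal separator for milliseconds and that the timestamps can be treated as UTC; the names `cats`, `stamps`, `times`, `spent` and `totals` are made up for illustration.

```r
categoryVector <- c("1_100_1_2_3")
timeVector <- c("2013-03-07 05:16:50,617_2013-03-07 05:19:24,984_2013-03-07 05:21:06,002_2013-03-07 05:21:06,833_2013-03-07 05:21:10,713")

# Split the record on "_"; [[1]] picks the first (and only) record.
cats   <- strsplit(categoryVector, "_", fixed = TRUE)[[1]]
stamps <- strsplit(timeVector, "_", fixed = TRUE)[[1]]

# Swap the decimal comma for a dot so %OS can read the fractional seconds.
# tz = "UTC" is an assumption; the question does not state a time zone.
times <- as.POSIXct(sub(",", ".", stamps, fixed = TRUE),
                    format = "%Y-%m-%d %H:%M:%OS", tz = "UTC")

# Time spent in a category = next timestamp minus its own timestamp, in seconds.
spent <- as.numeric(diff(times), units = "secs")

# The last category has no following timestamp, so it is left out of the sums.
totals <- tapply(spent, cats[-length(cats)], sum)
totals
# roughly 155.198 s for category 1, 101.018 s for 100, 3.880 s for 2
```

Category 1 appears twice in the sequence, so its total is the sum of both visits, which matches the formula in the question.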

Answer

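# One timestamp per row; sep = "," splits the milliseconds off into V2,
# so the differences below are in whole seconds only.
inp <- read.table(text = gsub("_", "\n", timeVector), sep = ",")
inp$V1 <- as.POSIXct(inp$V1, tz = "UTC")

# One category per row, named so that cbind() does not create a second "V1".
inp2 <- read.table(text = gsub("_", "\n", categoryVector), col.names = "category")

# Seconds until the next timestamp; the last row has no successor, hence NA.
# Fixing units = "secs" stops difftime from switching to minutes or hours.
inp$diffs <- c(as.numeric(difftime(inp$V1[-1], inp$V1[-nrow(inp)], units = "secs")), NA)
inp <- cbind(inp, inp2)
# Printing inp gives:
#                    V1  V2 diffs category
# 1 2013-03-07 05:16:50 617   154        1
# 2 2013-03-07 05:19:24 984   102      100
# 3 2013-03-07 05:21:06   2     0        1
# 4 2013-03-07 05:21:06 833     4        2
# 5 2013-03-07 05:21:10 713    NA        3

# Total seconds per category.
tapply(inp$diffs, inp$category, sum, na.rm = TRUE)
#   1   2   3 100
# 154   4   0 102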
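For the 200K+ records mentioned in the question, one option is to split all records at once, stack them into a long table and aggregate per record and category, instead of looping over records. The following is only a sketch under assumptions: the record from the question is repeated three times with `rep()` to stand in for real data, the timestamps are treated as UTC, and the names `long`, `secs`, `gap` and `totals` are made up for illustration. Unlike the code above, it keeps the milliseconds.

```r
# Stand-in data: the single record from the question, repeated three times.
categoryVector <- rep("1_100_1_2_3", 3)
timeVector <- rep("2013-03-07 05:16:50,617_2013-03-07 05:19:24,984_2013-03-07 05:21:06,002_2013-03-07 05:21:06,833_2013-03-07 05:21:10,713", 3)

# strsplit() is vectorised: one list element per record.
cats   <- strsplit(categoryVector, "_", fixed = TRUE)
stamps <- strsplit(timeVector, "_", fixed = TRUE)

# Long format: one row per (record, category, timestamp).
long <- data.frame(
  record   = rep(seq_along(cats), lengths(cats)),
  category = unlist(cats),
  time     = as.POSIXct(sub(",", ".", unlist(stamps), fixed = TRUE),
                        format = "%Y-%m-%d %H:%M:%OS", tz = "UTC"),
  stringsAsFactors = FALSE
)

# Seconds until the next row's timestamp.
secs <- as.numeric(long$time)
gap  <- c(diff(secs), NA)
# The last row of each record must not borrow a timestamp from the next record.
gap[c(diff(long$record) != 0, TRUE)] <- NA
long$spent <- gap

# Sum per record and category; rows with NA are dropped by the formula interface.
totals <- aggregate(spent ~ record + category, data = long, FUN = sum)
totals
```

Whether this is fast enough for the real data would need to be measured; the point is only that the parsing and differencing run once over all rows rather than once per record.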