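Group a data frame by hour in R
I have a data frame that contains DateTime values in a Date column and, in three other columns, a count for each datetime.
I want to group the data by hour and sum all three count columns.
aggregate works for a single count column, but I want to do this for the whole data frame. Any tips?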
aggregate(DateFreq$ColA,by=list((substr(DateFreq$Date,1,13))),sum)
You can use the formula interface of aggregate, but you should create a proper hour variable first.
dat$hour <- as.POSIXlt(dat$Date)$hour
aggregate(.~hour,data=dat,sum)
Here is an example:
Lines <- "Date,c1,c2,c3
06/25/2013 12:01,0,1,1
06/25/2013 12:08,-1,1,1
06/25/2013 12:48,0,1,1
06/25/2013 12:58,0,1,1
06/25/2013 13:01,0,1,1
06/25/2013 13:08,0,1,1
06/25/2013 13:48,0,1,1
06/25/2013 13:58,0,1,1
06/25/2013 14:01,0,1,1
06/25/2013 14:08,0,1,1
06/25/2013 14:48,0,1,1
06/25/2013 14:58,0,1,1"
library(zoo) ## better to read/manipulate time series
z <- read.zoo(text = Lines, header = TRUE, sep = ",",
index=0:1,tz='',
format = "%m/%d/%Y %H:%M")
dat <- data.frame(Date = index(z),coredata(z))
dat$hour <- as.POSIXlt(dat$Date)$hour
aggregate(.~hour,data=dat,sum)
hour Date c1 c2 c3
1 12 5488624500 -1 4 4
2 13 5488638900 0 4 4
3 14 5488653300 0 4 4
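Two things in the output above are worth noting: `. ~ hour` sums every remaining column, including Date, and `$hour` is the hour of the day, so the same hour on different days would end up in one group. The sketch below is one possible way around both, assuming the grouping should be per calendar hour (as in the question's substr(..., 1, 13) attempt). The sample rows, including the extra row on a second day, are made up for illustration.

```r
# Hypothetical sample data with the same layout as the example above,
# plus one row on a second day to show the difference.
dat <- data.frame(
  Date = as.POSIXct(
    c("2013-06-25 12:01", "2013-06-25 12:48", "2013-06-25 13:08",
      "2013-06-26 12:15"),
    format = "%Y-%m-%d %H:%M", tz = "UTC"
  ),
  c1 = c(0, -1, 0, 2),
  c2 = c(1, 1, 1, 1),
  c3 = c(1, 1, 1, 1)
)

# Label each row with its calendar hour, e.g. "2013-06-25 12".
dat$hour <- format(dat$Date, "%Y-%m-%d %H")

# cbind() on the left-hand side names the columns to sum, so Date is left out.
res <- aggregate(cbind(c1, c2, c3) ~ hour, data = dat, FUN = sum)
print(res)
```

With this grouping key, 12:00 on 25 June and 12:00 on 26 June stay in separate rows, and the result contains only hour, c1, c2 and c3.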
You can also use dplyr: dplyr::group_by and dplyr::summarise do the aggregation:
library(lubridate)
library(anytime)
library(tidyverse)
Lines <- "Date,c1,c2,c3
06/25/2013 12:01,0,1,1
06/25/2013 12:08,-1,1,1
06/25/2013 12:48,0,1,1
06/25/2013 12:58,0,1,1
06/25/2013 13:01,0,1,1
06/25/2013 13:08,0,1,1
06/25/2013 13:48,0,1,1
06/25/2013 13:58,0,1,1
06/25/2013 14:01,0,1,1
06/25/2013 14:08,0,1,1
06/25/2013 14:48,0,1,1
06/25/2013 14:58,0,1,1"
setClass("myDate")
setAs("character","myDate", function(from) anytime(from))
df <- read.csv(text = Lines, header=TRUE, colClasses = c("myDate", "numeric", "numeric", "numeric"))
df %>%
  group_by(Date = floor_date(Date, "1 hour")) %>%
  summarize(c1 = sum(c1), c2 = sum(c2), c3 = sum(c3))
# A tibble: 3 × 4
Date c1 c2 c3
<dttm> <dbl> <dbl> <dbl>
1 2013-06-25 12:00:00 -1 4 4
2 2013-06-25 13:00:00 0 4 4
3 2013-06-25 14:00:00 0 4 4
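Since the question asks for the whole data frame rather than one column at a time, the summarise step could also be written without naming each count column. This is a minimal sketch, assuming dplyr 1.0 or later (for across()) and that every non-grouping column is a count to be summed. The three sample rows are a shortened, illustrative subset of the data above, parsed with as.POSIXct instead of anytime.

```r
library(dplyr)
library(lubridate)

# Illustrative subset of the sample data, parsed with base R.
df <- data.frame(
  Date = as.POSIXct(
    c("06/25/2013 12:01", "06/25/2013 12:08", "06/25/2013 13:01"),
    format = "%m/%d/%Y %H:%M", tz = "UTC"
  ),
  c1 = c(0, -1, 0),
  c2 = c(1, 1, 1),
  c3 = c(1, 1, 1)
)

# Round each timestamp down to the hour, then sum every remaining column.
# Inside summarise(), everything() does not include the grouping column.
hourly <- df %>%
  group_by(Date = floor_date(Date, "hour")) %>%
  summarise(across(everything(), sum), .groups = "drop")

print(hourly)
```

If the data frame also held columns that should not be summed, the selection inside across() would need to be narrowed, for example to c1:c3.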
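You should provide a reproducible example of your data. People here should be able to copy and paste your code and reproduce the problem. – agstudy
Sorry, I'll keep that in mind. – ganeshran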