2017-08-04

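Passing dynamic values to the aggregate function in R

I am trying to aggregate Profit at the Year, Month, or Date level. I read the aggregation level from another file and want to pass that value to the aggregate function, but it throws an error.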

library(lubridate) 

parameter <- read.csv("Parameter.csv",header = F,col.names = c("Option","Value")) 
head(parameter) 
orders <- read.csv("Orders_Data.csv") 
str(orders) 

orders$Order.Date <- as.POSIXct(orders$Order.Date,format ="%m/%d/%Y") 
orders$month = months(orders$Order.Date) 
orders$Year <- year(orders$Order.Date) 
head(orders$Year) 


option = as.character(parameter[1,2]) #option holds the level of aggregate 
option 

#[1] "Day" 

aggregate(Profit ~ Category + option ,data = orders, sum) 

The error is:

Error in model.frame.default(formula = Profit ~ Category + option, data = orders) : 
    variable lengths differ (found for 'option') 

Here is reproducible data:

option = "Year" 

aggregate(Profit ~ Category + option ,data = orders, sum) 

example <- data.frame(
  date = sample(seq(as.Date("1999/01/01"), as.Date("2000/01/01"), by = "day"), 24),
  Profit = sample(seq(-200, 1200), 24),
  Department = sample(LETTERS, 24)
)


example$Year <- year(example$date) 
head(example) 
aggregate(Profit ~ Department + option,data = example, sum) 

It still gives the same error.


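Please provide a small reproducible example and the expected output. Using the single-element `option` like that will not work. You may need to ... in the dataset – akrun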

Answer

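In short, you need to build the formula as a string by hand, convert it to an actual formula, and then pass that to aggregate. Written directly inside the formula, `option` is treated as a variable name, so R picks up the length-one character vector instead of a column of the data, which is why it reports "variable lengths differ".

Like this: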

option <- "Year"
# Paste the column name into the formula text, then convert it to a formula
f <- as.formula(paste0("Profit ~ Department + ", option))
aggregate(f, data = example, FUN = sum)
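
For a self-contained check of this approach, here is a minimal sketch on a small made-up data frame. The `toy` data and its values are assumptions for illustration, not the asker's data. It also shows base R's `reformulate()`, which builds the same kind of formula from a character vector of column names without pasting strings by hand.

```r
# Toy data, invented for illustration only
toy <- data.frame(
  Department = c("A", "A", "B", "B", "A"),
  Year       = c(1999, 1999, 1999, 2000, 2000),
  Profit     = c(10, 20, 5, 7, 3)
)

option <- "Year"  # aggregation level, e.g. read from a parameter file

# Build the formula from a string ...
f1 <- as.formula(paste0("Profit ~ Department + ", option))
# ... or from a character vector of term labels
f2 <- reformulate(c("Department", option), response = "Profit")

res1 <- aggregate(f1, data = toy, FUN = sum)
res2 <- aggregate(f2, data = toy, FUN = sum)
print(res1)
```

Both calls should give the same table of summed Profit per Department and Year. `reformulate()` may be the more convenient choice when the grouping columns arrive as a vector rather than a single name.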

However, I think data.table is easier (and faster!):

library(data.table)
example <- as.data.table(example)

# by accepts a character vector, so option can be used directly
example[, .(Profit = sum(Profit)), by = c("Department", option)]
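
A self-contained version of the data.table call, again on invented toy data and assuming the data.table package is installed. Because `by` accepts a character vector of column names, the grouping level can come straight from a variable. The `stopifnot()` guard is an added suggestion rather than part of the original answer; it would catch a level such as "Day" when the data has no column of that name.

```r
library(data.table)

# Toy data, invented for illustration only
toy <- data.table(
  Department = c("A", "A", "B", "B", "A"),
  Year       = c(1999, 1999, 1999, 2000, 2000),
  Profit     = c(10, 20, 5, 7, 3)
)

option <- "Year"  # aggregation level, e.g. read from a parameter file

# Fail early if the requested level is not a column of the data
stopifnot(option %in% names(toy))

# Sum Profit within each Department / option group
res <- toy[, .(Profit = sum(Profit)), by = c("Department", option)]
print(res)
```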

This is very nice and easy –