2017-04-22 46 views

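Hi everyone. I know I have seen posts like this before, but for some reason none of the suggestions I tried have worked. Basically, I want to take the dates in a variable named "Production.Period.End.Date", which are formatted dd/mm/yyyy, and split each part of those dates into a separate object for analysis. My reason is that I want to take the yearly average kWh production, labelled "Period_kWh_Production", and track how it changes over time. I have pasted my code below in case it helps. I am having trouble turning the years into a separate object.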

setwd("C:/Users/fredd/Dropbox/Grad_Life/Spring_2017/AFM/Final_Paper/")

KWTProd.df = read.csv("Merge1//Kwht_Production_07-15.csv", header=T) 

##Did this to verify "Production.Period.End.Date" 

names(KWTProd.df) 

## Output of names(KWTProd.df):
##  [1] "Application.Number"
##  [2] "Program.Administrator"
##  [3] "Program"
##  [4] "Total.Cost"
##  [5] "System.Owner.Sector"
##  [6] "Host.Customer.Sector"
##  [7] "Host.Customer.Physical.Address.City"
##  [8] "Host.Customer.Physical.Address.County"
##  [9] "Host.Customer.Physical.Address.Zip.Code"
## [10] "PBI.Payment.."
## [11] "Production.Period.End.Date"
## [12] "Period_kWh_Production"   <- IT EXISTS

##Did this to plot changes of Period_kWh_Production over time## 

plot(Period_kWh_Production ~ Production.Period.End.Date, data = KWTProd.df) 

##Tried to do this to aggregate data in average## 

aggregate(Period_kWh_Production~Production.Period.End.Date,KWTProd.df,mean) 

##Still too noisy and can't find the mean by year :C## 

as.date(Production.Period.End.Date, data = KWTProd.df) 

##Says "Production.Period.End.Date" Not found BUT IT EXISTS## 

##Tried this to group and summarise by year, but it says:
##  Error in UseMethod("mutate_") :
##  no applicable method for 'mutate_' applied to an object of class "function"

summary <- df %>% 
    mutate(dates = dmy(Production.Period.End.Date), 
     year = year(Production.Period.End.Date)) %>% 
    group_by(year) %>% 
    summarise(mean = mean(x, na.rm = TRUE), 
      sd = sd(x, na.rm = TRUE)) 

##Trying this but have no clue how I am supposed to use this## 

regexpr("<dd>") 

I don't know much about the code, but the regular expression is '\d{2}/\d{2}/\d{4}' – sln
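To make the regex suggestion concrete, here is a minimal sketch of how that pattern could be used in base R to pull the day, month, and year out of dd/mm/yyyy strings. The sample values and the names `dates`, `pattern`, and `parts` are made up for illustration and are not from the asker's data; the pattern is the one from the comment with capture groups and anchors added.

```r
# Hypothetical sample values in dd/mm/yyyy form (not taken from the real data set).
dates <- c("31/01/2007", "28/02/2008", "15/12/2015")

# Same pattern as in the comment, with three capture groups: day, month, year.
# Backslashes are doubled because this is an R string literal.
pattern <- "^(\\d{2})/(\\d{2})/(\\d{4})$"

# sub() replaces the whole match with the requested capture group.
day   <- as.integer(sub(pattern, "\\1", dates))
month <- as.integer(sub(pattern, "\\2", dates))
year  <- as.integer(sub(pattern, "\\3", dates))

# Collect the pieces in one data frame for inspection.
parts <- data.frame(day, month, year)
print(parts)
```

For real date arithmetic a proper date parser is usually the safer route, since a regex only splits text and does not check that the date is valid.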

Answer


This code depends on the dplyr and lubridate packages. You have not provided sample data, so it is untested.

library(lubridate) 
library(dplyr) 

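# Use the data frame read in the question (KWTProd.df). A bare `df` refers to
# R's built-in F-distribution density function (stats::df), which is what
# produces the 'mutate_' ... class "function" error.
yearly_summary <- KWTProd.df %>% 
    mutate(end_date = dmy(Production.Period.End.Date),   # parse dd/mm/yyyy text
     production_year = year(end_date)) %>%              # extract the calendar year
    group_by(production_year) %>% 
    summarise(mean_kwH = mean(Period_kWh_Production, na.rm = TRUE), 
      sd_kwH = sd(Period_kWh_Production, na.rm = TRUE)) 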
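If installing packages is a problem, the same idea can be sketched in base R. This is only an illustration on made-up toy data (the data frame `toy` and its values are assumptions, not the asker's file). It also shows two points relevant to the `as.date` attempt in the question: R is case-sensitive, so the function is `as.Date`, and it has no `data` argument, so the column has to be referenced through the data frame with `$`.

```r
# Toy data standing in for KWTProd.df (values are made up for illustration).
toy <- data.frame(
  Production.Period.End.Date = c("31/01/2007", "30/06/2007",
                                 "31/01/2008", "30/06/2008"),
  Period_kWh_Production = c(100, 200, 300, 500),
  stringsAsFactors = FALSE
)

# as.Date() parses text using an explicit format string: day/month/4-digit year.
toy$end_date <- as.Date(toy$Production.Period.End.Date, format = "%d/%m/%Y")

# Pull the four-digit year out of each parsed date.
toy$production_year <- as.integer(format(toy$end_date, "%Y"))

# Mean production per year, using the same formula interface as the question.
yearly_mean <- aggregate(Period_kWh_Production ~ production_year,
                         data = toy, FUN = mean)
print(yearly_mean)
```

Grouping on the derived year column rather than on the full date is what removes the noise the question complains about, because every row in a given year falls into one group.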

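I tried that, but for some reason I keep getting errors: Error: unexpected ')' in: "summarise(mean_kwH = mean(Period_kWh_Production, na.rm = TRUE), sd_kwH = sd(Period_kWh_Production), na.rm = TRUE))", and then: no applicable method for 'mutate_' applied to an object of class "function" –

If you add data to your question, we can help. The usual way is to call the function `dput` and paste the result. I suggest you look at http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example. I edited the answer to remove an extra ) – epi99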
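For a large data set, one common way to follow the `dput` advice is to apply it to a small slice rather than to the whole object. The sketch below uses made-up toy data (the names `toy`, `small`, and `rebuilt` and all values are illustrative assumptions); with the real data, `toy` would be replaced by the data frame read from the CSV file.

```r
# Toy data standing in for the full data set (values are made up).
toy <- data.frame(
  Production.Period.End.Date = c("31/01/2007", "28/02/2007", "31/03/2007"),
  Period_kWh_Production = c(120.5, 98.2, 143.0),
  stringsAsFactors = FALSE
)

# Keep only the columns the question is about and the first few rows.
small <- head(toy[, c("Production.Period.End.Date", "Period_kWh_Production")], 2)

# dput() prints R code that rebuilds `small`; that text is what goes in the question.
dput(small)

# Round trip: the printed text can be parsed and evaluated to rebuild the object.
rebuilt <- eval(parse(text = paste(capture.output(dput(small)), collapse = "\n")))
```

Restricting both rows and columns keeps the pasted output short enough to read while still showing the exact column types, which is usually what answerers need.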


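Sorry, I made this harder than it needed to be, but `dput` seems to make my console explode with numbers because it is a large data set. I don't know if this helps, but based on a comment in the link you sent me, I used Pastebin to cut down the amount of output shown, and I still get this result: –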