2015-05-13 89 views
2

Using R to compare dates

I have two csv files.

One file lists when and why each employee separated from the company.

EmployeeID,Department,Separation_Type,Separation_Date,FYFQ  
119549,Sales,Retirement,09/30/2013 
2629053,Sales,Termination,09/30/2013 
120395,Sales,Retirement,11/01/2013 
122450,Sales,Transfer,11/30/2013 
123962,Sales,Transfer,11/30/2013 
1041054,Sales,Resignation,12/01/2013 
990962,Sales,Retirement,12/14/2013 
135396,Sales,Retirement,01/11/2014 

The other file is a lookup table showing the start and end dates of each fiscal quarter:

FYFQ,Start,End 
FY2014FQ1,10/1/2013,12/31/2013 
FY2014FQ2,1/1/2014,3/31/2014 
FY2014FQ3,4/1/2014,6/30/2014 
FY2014FQ4,7/1/2014,9/30/2014 
FY2015FQ1,10/1/2014,12/31/2014 
FY2015FQ2,1/1/2015,3/31/2015 

I would like R to find the FYFQ in which each Separation_Date occurred and write it into the FYFQ column of the data.

Input:

Separations.csv: 
>EmployeeID,Department,Separation_Type,Separation_Date,FYFQ  
>990962,Sales,Retirement,12/14/2013 
>135396,Sales,Retirement,01/11/2014   

FiscalQuarterDates.csv:

>FYFQ,Start,End 
>FY2013FQ4,7/1/2013,9/30/2013 
>FY2014FQ1,10/1/2013,12/31/2013 
>FY2014FQ2,1/1/2014,3/31/2014 

Desired output:
Output.csv:

>EmployeeID,Department,Separation_Type,Separation_Date,FYFQ  
>990962,Sales,Retirement,12/14/2013,FY2014FQ1 
>135396,Sales,Retirement,01/11/2014,FY2014FQ2  

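I assume there is some function that would loop through FiscalQuarterDates.csv and evaluate whether each separation date falls within an FYFQ, but I'm not sure.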

Any ideas on the best way to do this?

This is what worked.

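# Read in the CSV. Separation_Date (the 4th column) is kept as text, because
# colClasses = "Date" cannot parse mm/dd/yyyy; as.yearqtr() below parses it with an explicit format.
separations <- read.csv(file = "Separations_DummyData.csv", header = TRUE, sep = ",", stringsAsFactors = FALSE)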


# Use the zoo package (installed separately) to convert Separation_Date to a year-quarter and shift it forward by one quarter (+ 1/4). Then format the result as FYyyyyFQq.
library(zoo) 
separations$FYFQ <- format(as.yearqtr(separations$Separation_Date, "%m/%d/%Y") + 1/4, "FY%YFQ%q") 

#Write out this to CSV in working directory. 
write.csv(separations, file = "sepscomplete.csv", row.names = FALSE) 

Answers

4

You don't really need the second data frame; a simple function will solve this:

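# Pull the year and month out of the mm/dd/yyyy strings
yr <- with(firstdf, as.numeric(substr(Separation_Date, 7, 10)))
mth <- with(firstdf, as.numeric(substr(Separation_Date, 1, 2)))

# The fiscal year starts in October: Jan-Mar is FQ2, Apr-Jun is FQ3,
# Jul-Sep is FQ4, and Oct-Dec is FQ1 of the next fiscal year
firstdf$FYFQ <- ifelse(mth <= 3, paste0("FY", yr, "FQ2"),
                ifelse(mth <= 6, paste0("FY", yr, "FQ3"),
                ifelse(mth <= 9, paste0("FY", yr, "FQ4"),
                       paste0("FY", yr + 1, "FQ1"))))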
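
The same month-to-quarter mapping can also be written arithmetically instead of with nested ifelse calls. The following is only a sketch in base R: the function name fiscal_quarter and its fmt argument are made up for illustration, and it assumes the fiscal year starts on October 1, as in the lookup table from the question.

```r
# Hypothetical helper (not from the original answer): maps mm/dd/yyyy strings
# to labels such as "FY2014FQ1", assuming the fiscal year starts on October 1.
fiscal_quarter <- function(dates, fmt = "%m/%d/%Y") {
  d   <- as.Date(dates, format = fmt)
  mth <- as.integer(format(d, "%m"))
  yr  <- as.integer(format(d, "%Y"))
  # October through December belong to the next fiscal year
  fy  <- ifelse(mth >= 10, yr + 1L, yr)
  # Calendar quarter shifted forward by one: Oct-Dec -> 1, Jan-Mar -> 2,
  # Apr-Jun -> 3, Jul-Sep -> 4
  fq  <- ((mth + 2L) %/% 3L) %% 4L + 1L
  paste0("FY", fy, "FQ", fq)
}

fiscal_quarter(c("12/14/2013", "01/11/2014"))
```

For the two sample rows this should reproduce the labels in the desired output, and a July date lands in FQ4.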
2

Convert each date to the "yearqtr" class (from the zoo package) and add 1/4 to shift it to the next calendar quarter. Then use write.csv to write it out:

library(zoo) 
DF$FYFQ <- format(as.yearqtr(DF$Separation_Date, "%m/%d/%Y") + 1/4, "FY%YFQ%q") 

giving:

> write.csv(DF, file = stdout(), row.names = FALSE) 
"EmployeeID","Department","Separation_Type","Separation_Date","FYFQ" 
990962,"Sales","Retirement","12/14/2013","FY2014FQ1" 
135396,"Sales","Retirement","01/11/2014","FY2014FQ2" 

Notes:

1) If FYFQ does not have to be in exactly the format shown, this can be simplified to:

DF$FYFQ <- as.yearqtr(DF$Separation_Date, "%m/%d/%Y") + 1/4 

2) The second input file listed in the question is not used.

3) We used this input data:

Lines <- "EmployeeID,Department,Separation_Type,Separation_Date,FYFQ 
990962,Sales,Retirement,12/14/2013 
135396,Sales,Retirement,01/11/2014" 

DF <- read.csv(text = Lines) 

4) Fixed so that it produces shifted calendar quarters.
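
If the lookup table from the question does have to be used (for instance because the quarter boundaries are irregular), one possible base R approach is findInterval against the sorted start dates. This is a sketch only: lookup_fyfq is a hypothetical helper, the table is inlined from the question's FiscalQuarterDates.csv sample, and quarters are assumed not to overlap.

```r
# Lookup table inlined from the question's sample; in practice it would be
# read from FiscalQuarterDates.csv instead.
quarters <- read.csv(text = "FYFQ,Start,End
FY2013FQ4,7/1/2013,9/30/2013
FY2014FQ1,10/1/2013,12/31/2013
FY2014FQ2,1/1/2014,3/31/2014", stringsAsFactors = FALSE)
quarters$Start <- as.Date(quarters$Start, format = "%m/%d/%Y")
quarters$End   <- as.Date(quarters$End,   format = "%m/%d/%Y")
quarters <- quarters[order(quarters$Start), ]  # findInterval needs sorted breaks

# Hypothetical helper: returns the FYFQ whose [Start, End] range contains each
# date, or NA when no quarter in the table covers it.
lookup_fyfq <- function(dates, quarters, fmt = "%m/%d/%Y") {
  d   <- as.Date(dates, format = fmt)
  # Index of the last quarter whose Start is on or before the date (0 if none)
  idx <- findInterval(as.numeric(d), as.numeric(quarters$Start))
  idx[which(idx == 0)] <- NA
  out <- quarters$FYFQ[idx]
  # Reject dates that fall after the matched quarter's End
  out[!is.na(idx) & d > quarters$End[idx]] <- NA
  out
}

lookup_fyfq(c("12/14/2013", "01/11/2014"), quarters)
```

Unlike the date arithmetic above, this version depends on the table covering every date; dates outside all listed quarters come back as NA rather than a guessed label.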

0

The body of this answer was just a copy of another answer, so it has been moved into the question.

+3

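There is no need to repeat another answer. Everyone will know which answer you used by seeing which one you checked. If you want to give other feedback, use comments or update your question. –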

+0

@G.Grothendieck the "what worked" part is not just one answer, so I posted a comment. – bw1984