2015-05-14 16 views

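R: processing a CSV based on IF (ColA = ColB), taking the evaluation of ColC into account

I am trying to implement a simple string comparison across two columns. A sample of the data (trimmed):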

EMPLID,From_DeptCode,FromDept,To_DeptCode,To_Dept,TransactionTypeCode,TransactionType,EffectiveDate,ChangeType 
0239583290,21,Sales,43,CustomerService,10,Promotion,12/12/2012 
1230495829,21,Sales,21,Sales,10,Promotion,9/1/2013 
4059503918,93,Operations,93,Operations,10,Demotion,11/18/2014 
3040593021,19,Headquarters,23,International,11,Reorg,12/13/2011 
7029406920,15,Marketing,84,Development,19,Reassignment,01/05/2010 
2039052819,19,Headquarters,19,Headquarters,10,Promotion,4/15/2015 

The logic I want to apply is:

If From_DeptCode = To_DeptCode 
     then ChangeType="No Change" 
ElseIf From_DeptCode != To_DeptCode AND TransactionType = "Reorg" 
     then ChangeType="Reorg" 
Else ChangeType="Transfer" 

So my output would look like:

EMPLID,From_DeptCode,FromDept,To_DeptCode,To_Dept,TransactionTypeCode,TransactionType,EffectiveDate,ChangeType 
0239583290,21,Sales,43,CustomerService,10,Promotion,12/12/2012,Transfer 
1230495829,21,Sales,21,Sales,10,Promotion,9/1/2013,No Change 
4059503918,93,Operations,93,Operations,10,Demotion,11/18/2014,No Change 
3040593021,19,Headquarters,23,International,11,Reorg,12/13/2011,Reorg 
7029406920,15,Marketing,84,Development,19,Reassignment,01/05/2010,Transfer 
2039052819,19,Headquarters,19,Headquarters,10,Promotion,4/15/2015,No Change 

Here is what I have so far:

transfers <- read.csv(file="Transfers.csv", head=TRUE, 
    sep=",",colClasses=c(NA,NA,NA,NA,NA,NA,NA,"Date",NA)) 

At this point, I think, I want to implement my logic:

If From_DeptCode = To_DeptCode 
     then ChangeType="No Change" 
ElseIf From_DeptCode != To_DeptCode AND TransactionType = "Reorg" 
     then ChangeType="Reorg" 
Else ChangeType="Transfer" 

I think this is where I would write my new CSV:

write.csv(transfers, file="transfersprocessed.csv", row.names=FALSE)

Any suggestions for getting the rest of the way there?

Update:

Per the answer from @josilber, I ran the following code:

transfers <- read.csv(file="Transfers.csv", head=TRUE, sep=",", colClasses=c(NA,NA,NA,NA,NA,NA,NA,"Date",NA)) 

dat$ChangeType <- ifelse(dat$From_DeptCode == dat$To_DeptCode, "No Change",ifelse(dat$TransactionType == "Reorg", "Reorg", "Transfer")) 

View(transfers) 

On the following data:

EMPLID,From_DeptCode,FromDept,To_DeptCode,To_Dept,TransactionTypeCode,TransactionType,EffectiveDate,ChangeType 
0239583290,21,Sales,43,CustomerService,10,Promotion,12/12/2012 
1230495829,21,Sales,21,Sales,10,Promotion,9/1/2013 
4059503918,93,Operations,93,Operations,10,Demotion,11/18/2014 
3040593021,19,Headquarters,23,International,11,Reorg,12/13/2011 
7029406920,15,Marketing,84,Development,19,Reassignment,01/05/2010 
2039052819,19,Headquarters,19,Headquarters,10,Promotion,4/15/2015 

And the ChangeType variable is still NA.

Is the nested ifelse statement syntax correct? Any idea why ChangeType is not working?

Answers

You can nest ifelse statements to do this:

dat$ChangeType <- ifelse(dat$From_DeptCode == dat$To_DeptCode, "No Change", 
         ifelse(dat$TransactionType == "Reorg", "Reorg", "Transfer")) 
dat 
#  EMPLID From_DeptCode  FromDept To_DeptCode   To_Dept TransactionTypeCode 
# 1 239583290   21  Sales   43 CustomerService     10 
# 2 1230495829   21  Sales   21   Sales     10 
# 3 4059503918   93 Operations   93  Operations     10 
# 4 3040593021   19 Headquarters   23 International     11 
# 5 7029406920   15 Marketing   84  Development     19 
# 6 2039052819   19 Headquarters   19 Headquarters     10 
# TransactionType EffectiveDate ChangeType 
# 1  Promotion 12/12/2012 Transfer 
# 2  Promotion  9/1/2013 No Change 
# 3  Demotion 11/18/2014 No Change 
# 4   Reorg 12/13/2011  Reorg 
# 5 Reassignment 01/05/2010 Transfer 
# 6  Promotion  4/15/2015 No Change 

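ifelse takes a vector of TRUE/FALSE values as its first argument, uses the second argument for the TRUE cases, and uses the third argument for the FALSE cases. For your FALSE case you actually want to run another ifelse, which is why the logic is nested here.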
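As a minimal sketch of how the nested call evaluates, the snippet below rebuilds three of the sample rows from inline text (rather than reading Transfers.csv) and names the intermediate logical vectors. The names `csv_text`, `same_dept` and `is_reorg` are illustrative assumptions, not part of the original answer. The data frame is called `transfers` to match the question's read.csv call, since the new column has to be assigned on the same object that is later viewed or written out.

```r
# Three sample rows from the question, supplied as inline text (assumption:
# the real data would come from Transfers.csv instead).
csv_text <- "EMPLID,From_DeptCode,FromDept,To_DeptCode,To_Dept,TransactionTypeCode,TransactionType,EffectiveDate
0239583290,21,Sales,43,CustomerService,10,Promotion,12/12/2012
1230495829,21,Sales,21,Sales,10,Promotion,9/1/2013
3040593021,19,Headquarters,23,International,11,Reorg,12/13/2011"

# EMPLID is read as character so the leading zero is preserved.
transfers <- read.csv(text = csv_text, header = TRUE,
                      colClasses = c(EMPLID = "character"),
                      stringsAsFactors = FALSE)

# First argument of the outer ifelse: one TRUE/FALSE per row.
same_dept <- transfers$From_DeptCode == transfers$To_DeptCode
# First argument of the inner ifelse, used only where same_dept is FALSE.
is_reorg <- transfers$TransactionType == "Reorg"

# Assign to the same data frame that is inspected afterwards.
transfers$ChangeType <- ifelse(same_dept, "No Change",
                               ifelse(is_reorg, "Reorg", "Transfer"))

print(transfers[, c("EMPLID", "TransactionType", "ChangeType")])
```

Reading EMPLID as character is optional; it is done here only so that the leading zero in 0239583290 is kept.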

Note that for large data frames this will be much faster than looping through the data and running nested if statements one row at a time.
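To make the comparison concrete, here is a sketch that writes the same rule both ways on the six sample rows and checks that the results agree. The names `loop_result` and `vec_result` are made up for illustration, and no timings are claimed here; wrapping each version in system.time() on a larger data frame is one way to measure the difference.

```r
# The three columns the rule depends on, taken from the sample data.
dat <- data.frame(
  From_DeptCode   = c(21, 21, 93, 19, 15, 19),
  To_DeptCode     = c(43, 21, 93, 23, 84, 19),
  TransactionType = c("Promotion", "Promotion", "Demotion",
                      "Reorg", "Reassignment", "Promotion"),
  stringsAsFactors = FALSE
)

# Row-by-row version: nested if statements inside a loop.
loop_result <- character(nrow(dat))
for (i in seq_len(nrow(dat))) {
  if (dat$From_DeptCode[i] == dat$To_DeptCode[i]) {
    loop_result[i] <- "No Change"
  } else if (dat$TransactionType[i] == "Reorg") {
    loop_result[i] <- "Reorg"
  } else {
    loop_result[i] <- "Transfer"
  }
}

# Vectorized version: one nested ifelse over whole columns.
vec_result <- ifelse(dat$From_DeptCode == dat$To_DeptCode, "No Change",
                     ifelse(dat$TransactionType == "Reorg", "Reorg", "Transfer"))

# Both approaches should label every row the same way.
print(identical(loop_result, vec_result))
```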

Nice solution. It is good to see how R automatically vectorizes here: there is no need to tell R which row to look at. Awesome!

If, instead of printing "No Change" when dat$From_DeptCode == dat$To_DeptCode, I wanted it to skip that row or not print it, would I still use this format? – bw1984

@user1694958 This can be done by first generating the ChangeType variable and then subsetting on its value (e.g. something like `dat2 <- subset(dat, ChangeType != "No Change")`). – josliber
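A small sketch of the subsetting approach from the last comment, assuming ChangeType has already been filled in. The three-row data frame is made up from the sample data for illustration.

```r
# Hypothetical, already-labelled data: three rows based on the sample.
dat <- data.frame(
  EMPLID     = c("0239583290", "1230495829", "3040593021"),
  ChangeType = c("Transfer", "No Change", "Reorg"),
  stringsAsFactors = FALSE
)

# Keep only the rows whose department actually changed.
dat2 <- subset(dat, ChangeType != "No Change")
print(dat2)
```

subset() also drops rows where the condition evaluates to NA, so rows with a missing ChangeType would not appear in dat2.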