2017-04-13 58 views
0

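Replace a loop with a logical function using the apply family of functions (or dplyr) in R

I have created a representative data frame and used a for loop to assign a conditional category to each row.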

df <- data.frame(Date=c("08/29/2011", "08/29/2011", "08/30/2011", "08/30/2011", "08/30/2011", "08/29/2012", "08/29/2012", "01/15/2012", "08/29/2012"), 
      Time=c("09:45", "10:00", "13:00", "13:30", "10:14", "9:09", "11:23", "17:06", "12:20"), 
      Diff = c(0.2,4.3,6.5,15.0, 16.5, 31, 30.2, 21.9, 1.9)) 

library(dplyr)

df1 <- df %>%
    mutate(Accuracy = ifelse(Diff <= 3, "Excellent", "TBD"))

for (i in seq_len(nrow(df1))) {
    if (df1$Diff[i] > 3 && df1$Diff[i] <= 10) {
        df1$Accuracy[i] <- "Good"
    }
    if (df1$Diff[i] > 10 && df1$Diff[i] <= 15) {
        df1$Accuracy[i] <- "Fair"
    }
    if (df1$Diff[i] > 15 && df1$Diff[i] <= 30) {
        df1$Accuracy[i] <- "Poor"
    }
    if (df1$Diff[i] > 30) {
        df1$Accuracy[i] <- "Unacceptable"
    }
}

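My actual data set is very large, and from what I have read, loops are usually not the most efficient way to write code in R. I believe I could do the same thing by building a logical vector for each condition, where each vector is TRUE wherever its condition is met. I could then assign values by subsetting, e.g. df1$Accuracy[Good] <- "Good". However, I can't figure out how to use the apply family of functions or dplyr functions to create the logical vectors. (That said, any solution that avoids the for loop is welcome.) If a for loop is actually the better approach, that would also be helpful to know.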

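Here are my failed attempts. They return either incorrect NAs or incorrect logical vectors. One of the many things I don't understand is how lapply knows whether to iterate over columns or rows.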

Good<-apply(df1, 1, function(x) ifelse(df1$Diff[x]>3&& df1$Diff[x]<=10, TRUE, FALSE)) #logical, TRUE where condition is true 
Good<-unlist(lapply(df1$Diff, function(x) {(ifelse(df1$Diff[x]>3&& df1$Diff[x]<=10, TRUE, FALSE))})) 
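
A possible reading of why the attempts above misbehave, offered as a sketch rather than a definitive diagnosis: the function passed to lapply receives each element of df1$Diff (a value such as 4.3), not a row position, so df1$Diff[x] indexes by that value and returns the wrong element or NA. Comparison operators in R are already vectorised, so the logical vectors can be built without any apply call, using the element-wise & rather than &&. The data frame is rebuilt here with only the Diff column so the block runs on its own, and the names Good2 and the per-category vectors are illustrative.

```r
# Rebuild the example data (only Diff matters for the categories).
df1 <- data.frame(Diff = c(0.2, 4.3, 6.5, 15.0, 16.5, 31, 30.2, 21.9, 1.9))
df1$Accuracy <- "TBD"

# Comparisons are vectorised: each line yields one TRUE/FALSE per row.
Excellent    <- df1$Diff <= 3
Good         <- df1$Diff > 3  & df1$Diff <= 10
Fair         <- df1$Diff > 10 & df1$Diff <= 15
Poor         <- df1$Diff > 15 & df1$Diff <= 30
Unacceptable <- df1$Diff > 30

# Assign by logical subsetting, as proposed in the question.
df1$Accuracy[Excellent]    <- "Excellent"
df1$Accuracy[Good]         <- "Good"
df1$Accuracy[Fair]         <- "Fair"
df1$Accuracy[Poor]         <- "Poor"
df1$Accuracy[Unacceptable] <- "Unacceptable"

# If an apply-family call is wanted anyway, x is the value itself, not an index.
Good2 <- sapply(df1$Diff, function(x) x > 3 && x <= 10)
```

In this form sapply gives the same logical vector as the plain comparison, so on a large data set the vectorised version is likely the simpler choice.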

Update: nested ifelse statements will work, but any suggestions on how to use apply are still welcome.

df1 <- df %>%
    mutate(Accuracy = ifelse(Diff <= 3, "Excellent",
                      ifelse(Diff > 3 & Diff <= 10, "Good",
                      ifelse(Diff > 10 & Diff <= 15, "Fair",
                      ifelse(Diff > 15 & Diff <= 30, "Poor",
                      ifelse(Diff > 30, "Unpublishable", "TBD"))))))

Answers

2

You can use case_when from dplyr:

df1<- df %>% 
mutate(Accuracy= case_when(
    .$Diff <= 3 ~ "Excellent", 
    .$Diff <= 10 ~ "Good", 
    .$Diff <= 15 ~ "Fair", 
    .$Diff <= 30 ~ "Poor", 
    .$Diff > 30 ~ "Unpublishable", 
    TRUE ~"TBD") 
) 

df1 
        Date  Time Diff      Accuracy
1 08/29/2011 09:45  0.2     Excellent
2 08/29/2011 10:00  4.3          Good
3 08/30/2011 13:00  6.5          Good
4 08/30/2011 13:30 15.0          Fair
5 08/30/2011 10:14 16.5          Poor
6 08/29/2012  9:09 31.0 Unpublishable
7 08/29/2012 11:23 30.2 Unpublishable
8 01/15/2012 17:06 21.9          Poor
9 08/29/2012 12:20  1.9     Excellent
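
As a possible base R alternative (not part of the answer above, just a sketch), cut() can bin a numeric column into labelled intervals in one vectorised call. The breaks below are assumed from the thresholds in the question, with right = TRUE so that each interval includes its upper bound, e.g. (10, 15]. The data frame is rebuilt with only the Diff column so the block runs on its own.

```r
# Rebuild the example data (only Diff matters for the categories).
df <- data.frame(Diff = c(0.2, 4.3, 6.5, 15.0, 16.5, 31, 30.2, 21.9, 1.9))

# Intervals: (-Inf,3], (3,10], (10,15], (15,30], (30,Inf]
df$Accuracy <- cut(df$Diff,
                   breaks = c(-Inf, 3, 10, 15, 30, Inf),
                   labels = c("Excellent", "Good", "Fair", "Poor", "Unpublishable"),
                   right  = TRUE)

# cut() returns a factor; convert if plain strings are preferred.
df$Accuracy <- as.character(df$Accuracy)
```

Because the breaks and labels sit in two parallel vectors, adding or moving a threshold means editing one number and one label rather than another nested condition.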
+0

You had an error in your scale, which went from "Fair" to "Good" to "Unpublishable". I replaced the second "Good" value with "Poor". – 5th