2017-09-30

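Add a column with custom values to a data frame

I am trying to add a new column to an existing R data frame, where each value depends on the value in the corresponding row. If the value is 1 the new column should contain one, if the value is 2 it should contain two, and otherwise three or more.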

This code:

mydf <- data.frame(a = 1:6,
                   b = rep("reproducible", 6),
                   c = rep("example", 6),
                   stringsAsFactors = FALSE)
mydf

produces a data frame with six rows, where a runs from 1 to 6 and b and c hold the same string in every row.

Using the code:

mydf["encoded"] <- { if (mydf['a'] == 1) 'one' else if (mydf['a'] == 2) 'two' else 'three or more' } 
mydf 

produces the same data frame with an added encoded column.

A warning is also generated:

Warning message in if (mydf["a"] == 1) "one" else if (mydf["a"] == 2) "two" else "three or more": 
"the condition has length > 1 and only the first element will be used"

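A new column is added to the data frame, but all of its values are the same: one

Have I implemented the logic for adding the new column values incorrectly?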
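
For context, the warning arises because if expects a single TRUE or FALSE, whereas a comparison on a whole column yields one logical value per row, so only the first row decides the result. A small sketch of the difference, using a plain vector a as a stand-in for the column (an assumption made for brevity); note that from R 4.2.0 onward a condition longer than one element in if is an error rather than a warning:

```r
a <- 1:6

# The comparison is vectorised: it returns one logical value per element.
cond <- a == 1
length(cond)

# if () would only have looked at cond[1]; ifelse() tests every element.
encoded <- ifelse(a == 1, "one",
                  ifelse(a == 2, "two", "three or more"))
```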

Answers

2

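Solution using dplyr::case_when:

The syntax and logic are self-explanatory: when a equals 1, encoded equals "one"; when a equals 2, encoded equals "two"; in all other cases, encoded equals "three or more".
mutate simply creates the new column.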

library(dplyr)
mutate(mydf, encoded = case_when(a == 1 ~ "one",
                                 a == 2 ~ "two",
                                 TRUE ~ "three or more"))

  a            b       c       encoded
1 1 reproducible example           one
2 2 reproducible example           two
3 3 reproducible example three or more
4 4 reproducible example three or more
5 5 reproducible example three or more
6 6 reproducible example three or more
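
Note that mutate returns a modified copy and does not change mydf in place, so the result has to be assigned if the new column should persist. A minimal sketch, assuming dplyr is installed and mydf is built as in the question:

```r
library(dplyr)

mydf <- data.frame(a = 1:6,
                   b = rep("reproducible", 6),
                   c = rep("example", 6),
                   stringsAsFactors = FALSE)

# case_when checks the conditions in order and uses the first one that is TRUE,
# so the final TRUE acts as a catch-all for every remaining row.
mydf <- mutate(mydf, encoded = case_when(a == 1 ~ "one",
                                         a == 2 ~ "two",
                                         TRUE ~ "three or more"))
```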

Solution using base::ifelse:

mydf$encoded <- ifelse(mydf$a == 1,
                       "one",
                       ifelse(mydf$a == 2,
                              "two",
                              "three or more"))

If you don't like writing mydf$a multiple times, you can use with:

mydf$encoded <- with(mydf, ifelse(a == 1,
                                  "one",
                                  ifelse(a == 2,
                                         "two",
                                         "three or more")))
3

An often overlooked function for this is cut:

mydf$encoded <- cut(mydf$a, c(0:2,Inf), c('one','two','three or more')) 

Result:

> mydf
  a            b       c       encoded
1 1 reproducible example           one
2 2 reproducible example           two
3 3 reproducible example three or more
4 4 reproducible example three or more
5 5 reproducible example three or more
6 6 reproducible example three or more
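
Two details of cut are worth keeping in mind: it returns a factor, not a character vector, and by default its intervals are closed on the right, so values at or below the first break become NA. A short sketch on a plain vector a (a stand-in for mydf$a), with as.character as one way to get strings back:

```r
a <- 1:6

# Breaks 0, 1, 2, Inf define the intervals (0,1], (1,2] and (2,Inf],
# because cut() closes intervals on the right by default.
encoded <- cut(a, breaks = c(0:2, Inf),
               labels = c("one", "two", "three or more"))

class(encoded)                        # the result is a factor
encoded_chr <- as.character(encoded)  # convert to a plain character vector
```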

This is better than writing the conditions by hand :) Like your answer, +1 – Wen

1

sapply can also do the job:

mydf$encoded <- sapply(
  mydf$a,
  function(a) if (a == 1) 'one' else if (a == 2) 'two' else 'three or more')
mydf
#   a            b       c       encoded
# 1 1 reproducible example           one
# 2 2 reproducible example           two
# 3 3 reproducible example three or more
# 4 4 reproducible example three or more
# 5 5 reproducible example three or more
# 6 6 reproducible example three or more
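
A possible variation is vapply, which works like sapply but requires the expected type and length of each result to be declared, so an unexpected return value fails loudly. A minimal sketch; encode_one is a hypothetical helper name introduced here, and a stands in for mydf$a:

```r
a <- 1:6

# Hypothetical helper: encodes a single value, so a scalar if/else is fine here.
encode_one <- function(x) {
  if (x == 1) "one" else if (x == 2) "two" else "three or more"
}

# FUN.VALUE = character(1) declares that each call returns exactly one string.
encoded <- vapply(a, encode_one, FUN.VALUE = character(1))
```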