Nested ifelse statement

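I'm still learning how to translate SAS code into R, and I'm getting warnings. I need to understand where I'm making mistakes. What I want to do is create a variable that summarizes and distinguishes three statuses of a population: mainland, overseas, foreigner. I have a database with 2 variables:

ID nationality: idnat (french, foreigner),

If idnat is french, then:

ID birthplace: idbp (mainland, colony, overseas)

I want to summarize the information from idnat and idbp in a new variable called idnat2:

Status: K (mainland, overseas, foreigner)

All these variables use "character type".

Expected results in column idnat2: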

idnat  idbp idnat2 
1 french mainland mainland 
2 french colony overseas 
3 french overseas overseas 
4 foreign foreign foreign

Here is my SAS code, which I want to translate into R:

if idnat = "french" then do; 
    if idbp in ("overseas","colony") then idnat2 = "overseas"; 
    else idnat2 = "mainland"; 
end; 
else idnat2 = "foreigner"; 
run;

Here is my attempt in R:

if(idnat=="french"){ 
    idnat2 <- "mainland" 
} else if(idbp=="overseas"|idbp=="colony"){ 
    idnat2 <- "overseas" 
} else { 
    idnat2 <- "foreigner" 
}

I receive this warning:

Warning message: 
In if (idnat=="french") { : 
    the condition has length > 1 and only the first element will be used

I was advised to use a "nested ifelse" instead because it is easier, but I get even more warnings:

idnat2 <- ifelse (idnat=="french", "mainland", 
     ifelse (idbp=="overseas"|idbp=="colony", "overseas") 
    ) 
      else (idnat2 <- "foreigner")

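According to the warning message, the length is greater than 1, so only what is between the first brackets will be taken into account. Sorry, but I don't understand what this length has to do with anything here. Does anybody know where I'm going wrong?

Source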

2013-08-02 balour

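You shouldn't mix `ifelse` and `else`. – Roland

@Roland You're right, thanks for the advice. I have only put the result; what I want is just the idnat2 column, if that is clear. @KarlForner Thank you, that is exactly what I wanted to do with a simple example, but I really struggle with R. I tried to do the same thing in SPSS and it is simpler. – balour

My point is that SO is not a replacement for learning the language. There are lots of books and tutorials. You should post here when you are stuck and have already used all the other resources. Best. –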

Try something like the following:

# some sample data 
idnat <- sample(c("french","foreigner"),100,TRUE) 
idbp <- rep(NA,100) 
idbp[idnat=="french"] <- sample(c("mainland","overseas","colony"),sum(idnat=="french"),TRUE) 

# recoding 
out <- ifelse(idnat=="french" & !idbp %in% c("overseas","colony"), "mainland", 
       ifelse(idbp %in% c("overseas","colony"),"overseas", 
        "foreigner")) 
cbind(idnat,idbp,out) # check result

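Your confusion comes from how SAS and R handle if-else constructions. In R, if and else are not vectorized, meaning they check whether a single condition is true (i.e., if("french"=="french") works) and cannot handle multiple logicals (i.e., if(c("french","foreigner")=="french") doesn't work), which is why R gives you the warning you're receiving.

By comparison, ifelse is vectorized, so it can take your vectors (aka input variables) and test the logical condition on each of their elements, as you're used to in SAS. An alternative way to handle this would be to build a loop using if and else statements (as you've started to do here), but the vectorized ifelse approach will be more efficient and generally involves less code.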
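To make the contrast concrete, here is a minimal sketch on a four-row toy vector (made-up data, not the asker's database). The explicit loop hands `if` one element at a time, while the nested `ifelse()` does the same recoding in a single vectorized call. Note that in R 4.2.0 and later, passing a condition of length greater than one to `if` is an error rather than a warning.

```r
# Toy data (assumed for illustration only)
idnat <- c("french", "french", "french", "foreigner")
idbp  <- c("mainland", "colony", "overseas", NA)

# Loop version: `if` sees exactly one element per iteration
idnat2_loop <- character(length(idnat))
for (i in seq_along(idnat)) {
  if (idnat[i] == "french") {
    if (idbp[i] %in% c("overseas", "colony")) {
      idnat2_loop[i] <- "overseas"
    } else {
      idnat2_loop[i] <- "mainland"
    }
  } else {
    idnat2_loop[i] <- "foreigner"
  }
}

# Vectorized version: one call handles every element
idnat2_vec <- ifelse(idnat == "french",
                     ifelse(idbp %in% c("overseas", "colony"), "overseas", "mainland"),
                     "foreigner")

identical(idnat2_loop, idnat2_vec)
```

Both versions mirror the nesting of the SAS code: the birthplace test only matters for rows where idnat is "french".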

Source

2013-08-02 08:47:40 Thomas

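Hello. OK, IF and ELSE are not vectorized in R, which is why I got the warning about length > 1 and only the first TRUE argument was recorded. I'm going to try your hint about IFELSE, although the approach from Tomas greif seems efficient too. – balour

If you are using any spreadsheet application, there is a basic function if() with this syntax: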

if(<condition>, <yes>, <no>)

The syntax is exactly the same for ifelse() in R:

ifelse(<condition>, <yes>, <no>)

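The only difference from if() in a spreadsheet application is that R's ifelse() is vectorized (it takes vectors as input and returns a vector as output). Consider the following comparison between formulas in a spreadsheet application and in R, for an example where we want to test whether a > b and return 1 if so and 0 otherwise.

In a spreadsheet: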

  A B C
1 3 1 =if(A1 > B1, 1, 0)
2 2 2 =if(A2 > B2, 1, 0)
3 1 3 =if(A3 > B3, 1, 0)

In R:

> a <- 3:1; b <- 1:3 
> ifelse(a > b, 1, 0) 
[1] 1 0 0
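One detail worth knowing before recoding real data: where the condition is NA, ifelse() returns NA rather than either branch. The following is a small sketch with made-up vectors; the "unknown" label is just an illustrative choice.

```r
a <- c(3, 2, NA)
b <- c(1, 2, 3)

# The third comparison is NA, so the third result is NA as well
res <- ifelse(a > b, "a is larger", "a is not larger")
res

# One possible workaround: test for missing values first
res2 <- ifelse(is.na(a) | is.na(b), "unknown",
               ifelse(a > b, "a is larger", "a is not larger"))
res2
```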

ifelse() can be nested in many ways:

ifelse(<condition>, <yes>, ifelse(<condition>, <yes>, <no>)) 

ifelse(<condition>, ifelse(<condition>, <yes>, <no>), <no>) 

ifelse(<condition>, 
     ifelse(<condition>, <yes>, <no>), 
     ifelse(<condition>, <yes>, <no>) 
    ) 

ifelse(<condition>, <yes>, 
     ifelse(<condition>, <yes>, 
       ifelse(<condition>, <yes>, <no>) 
      ) 
     )
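As a concrete instance of the first pattern above (nesting in the "no" branch), here is a short sketch that labels numbers by sign. The vector `x` and the labels are invented for illustration.

```r
x <- c(-2, 0, 5)

# The inner ifelse() is only consulted where the outer condition is FALSE
sign_label <- ifelse(x < 0, "negative",
                     ifelse(x == 0, "zero", "positive"))
sign_label
```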

To calculate column idnat2 you can:

df <- read.table(header=TRUE, text=" 
idnat idbp idnat2 
french mainland mainland 
french colony overseas 
french overseas overseas 
foreign foreign foreign" 
) 

with(df, 
    ifelse(idnat=="french", 
     ifelse(idbp %in% c("overseas","colony"),"overseas","mainland"),"foreign") 
    )
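A natural next step is to store the result as a column and compare it with the expected values. The sketch below rebuilds the same four rows with data.frame() so that it is self-contained; the column name `check` is a hypothetical name chosen for this example.

```r
df <- data.frame(
  idnat  = c("french", "french", "french", "foreign"),
  idbp   = c("mainland", "colony", "overseas", "foreign"),
  idnat2 = c("mainland", "overseas", "overseas", "foreign"),
  stringsAsFactors = FALSE
)

# Same nested ifelse() as above, saved into a new column
df$check <- with(df,
  ifelse(idnat == "french",
         ifelse(idbp %in% c("overseas", "colony"), "overseas", "mainland"),
         "foreign"))

# TRUE when every computed value matches the expected column
all(df$check == df$idnat2)
```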

R Documentation

What does "the condition has length > 1 and only the first element will be used" mean? Let's see:

> # What is first condition really testing? 
> with(df, idnat=="french") 
[1] TRUE TRUE TRUE FALSE 
> # This is result of vectorized function - equality of all elements in idnat and 
> # string "french" is tested. 
> # Vector of logical values is returned (has the same length as idnat) 
> df$idnat2 <- with(df, 
+ if(idnat=="french"){ 
+ idnat2 <- "xxx" 
+ } 
+ ) 
Warning message: 
In if (idnat == "french") { : 
    the condition has length > 1 and only the first element will be used 
> # Note that the first element of comparison is TRUE and that's why we get: 
> df 
    idnat  idbp idnat2 
1 french mainland xxx 
2 french colony xxx 
3 french overseas xxx 
4 foreign foreign xxx 
> # There is really logic in it, you have to get used to it
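When a single TRUE or FALSE really is what you want from a vector comparison, the usual approach is to collapse it explicitly with any() or all() before handing it to if(). A minimal sketch, using the same four idnat values as above:

```r
idnat <- c("french", "french", "french", "foreign")
cond <- idnat == "french"

length(cond)  # one logical value per element of idnat
any(cond)     # is at least one element "french"?
all(cond)     # are all elements "french"?

# all() yields a single logical, so if() accepts it without complaint
msg <- if (all(cond)) "everyone is french" else "not everyone is french"
msg
```

This answers a different question from the row-by-row recoding, which still needs ifelse() or a loop.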

Can I still use if()? Yes, you can, but the syntax is not so cool :)

test <- function(x) { 
    if(x=="french") { 
    "french" 
    } else{ 
    "not really french" 
    } 
} 

apply(array(df[["idnat"]]),MARGIN=1, FUN=test)
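A slightly tidier variant of the same idea, offered as a sketch rather than as the answerer's method, is to call the scalar function once per element with vapply(), which also checks that each call returns a single string:

```r
# Scalar function: expects exactly one value
test <- function(x) {
  if (x == "french") "french" else "not really french"
}

idnat <- c("french", "french", "french", "foreign")

# character(1) declares the expected shape of each result
res <- vapply(idnat, test, character(1), USE.NAMES = FALSE)
res
```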

If you are familiar with SQL, you can also use a CASE statement from the sqldf package.

Source

2013-08-02 12:27:37

You can create the vector idnat2 without if and ifelse.

The function replace can be used to replace all occurrences of "colony" with "overseas":

idnat2 <- replace(idbp, idbp == "colony", "overseas")
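Note that this shortcut relies on foreign rows already carrying "foreign" in idbp, as in the expected-results table. Another if-free option in the same spirit is a named vector used as a lookup table. This is an alternative sketch, not part of the original answer, and `recode_map` is a hypothetical name:

```r
idbp <- c("mainland", "colony", "overseas", "foreign")

# Names are the old values, elements are the new values
recode_map <- c(mainland = "mainland",
                colony   = "overseas",
                overseas = "overseas",
                foreign  = "foreign")

# Indexing by name performs the recoding; unname() drops the leftover names
idnat2 <- unname(recode_map[idbp])
idnat2
```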

Source

2013-08-02 16:18:25

More or less the same: `df$idnat2 <- df$idbp; df$idnat2[df$idbp == 'colony'] <- 'overseas'` – Jaap

With data.table, the solution is:

DT[, idnat2 := ifelse(idbp %in% "foreign", "foreign", 
     ifelse(idbp %in% c("colony", "overseas"), "overseas", "mainland"))]

ifelse is vectorized; if-else is not. Here, DT is:

idnat  idbp 
1 french mainland 
2 french colony 
3 french overseas 
4 foreign foreign

This gives:

idnat  idbp idnat2 
1: french mainland mainland 
2: french colony overseas 
3: french overseas overseas 
4: foreign foreign foreign
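The answer does not show how DT was built, so here is a self-contained sketch that constructs it and applies the same recoding with data.table's own fcase(), which avoids the nesting. It assumes a data.table version that provides fcase() (1.13.0 or later, to my knowledge), and it is a swapped-in alternative rather than the answerer's code.

```r
library(data.table)

# Rebuild the four-row table shown above
DT <- data.table(
  idnat = c("french", "french", "french", "foreign"),
  idbp  = c("mainland", "colony", "overseas", "foreign")
)

# Conditions are checked in order; the first TRUE one determines the value
DT[, idnat2 := fcase(
  idnat != "french",                 "foreign",
  idbp %in% c("colony", "overseas"), "overseas",
  default = "mainland"
)]
DT[]
```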

Source

2016-09-19 09:22:52

A better way is: `DT[, idnat2 := idbp][idbp %in% c('colony', 'overseas'), idnat2 := 'overseas']` – Jaap

Even better: `DT[, idnat2 := idbp][idbp == 'colony', idnat2 := 'overseas']` – Jaap

Another `data.table` approach is to join with a lookup table: `DT[lookup, on = .(idnat, idbp), idnat2 := i.idnat2][]` – Uwe

Using an SQL CASE statement with the dplyr and sqldf packages:

Data

df <-structure(list(idnat = structure(c(2L, 2L, 2L, 1L), .Label = c("foreign", 
"french"), class = "factor"), idbp = structure(c(3L, 1L, 4L, 
2L), .Label = c("colony", "foreign", "mainland", "overseas"), class = "factor")), .Names = c("idnat", 
"idbp"), class = "data.frame", row.names = c(NA, -4L))

sqldf

library(sqldf) 
sqldf("SELECT idnat, idbp, 
     CASE 
      WHEN idbp IN ('colony', 'overseas') THEN 'overseas' 
      ELSE idbp 
     END AS idnat2 
     FROM df")

dplyr

library(dplyr) 
df %>% 
  mutate(idnat2 = case_when(idbp == "mainland" ~ "mainland",
                            idbp %in% c("colony", "overseas") ~ "overseas",
                            TRUE ~ "foreign"))

Output

idnat  idbp idnat2 
1 french mainland mainland 
2 french colony overseas 
3 french overseas overseas 
4 foreign foreign foreign

Source

2017-02-08 08:33:17 mpalanco

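If the data set contains many rows, it might be more efficient to join with a lookup table using data.table instead of nested ifelse().

Given the lookup table below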

lookup

 idnat  idbp idnat2 
1: french mainland mainland 
2: french colony overseas 
3: french overseas overseas 
4: foreign foreign foreign

and a sample data set

library(data.table) 
n_row <- 10L 
set.seed(1L) 
DT <- data.table(idnat = "french", 
       idbp = sample(c("mainland", "colony", "overseas", "foreign"), n_row, replace = TRUE)) 
DT[idbp == "foreign", idnat := "foreign"][]

 idnat  idbp 
1: french colony 
2: french colony 
3: french overseas 
4: foreign foreign 
5: french mainland 
6: foreign foreign 
7: foreign foreign 
8: french overseas 
9: french overseas 
10: french mainland

we can then do an update while joining:

DT[lookup, on = .(idnat, idbp), idnat2 := i.idnat2][]

      idnat     idbp   idnat2
 1:  french   colony overseas
 2:  french   colony overseas
 3:  french overseas overseas
 4: foreign  foreign  foreign
 5:  french mainland mainland
 6: foreign  foreign  foreign
 7: foreign  foreign  foreign
 8:  french overseas overseas
 9:  french overseas overseas
10:  french mainland mainland
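For readers without data.table, the same lookup idea can be sketched in base R with match(). This is an illustrative alternative under stated assumptions: the helper `key()` and the three-row `df` are made up for the example, and no claim is made about its speed relative to the join above.

```r
# Lookup table: one row per valid (idnat, idbp) combination
lookup <- data.frame(
  idnat  = c("french", "french", "french", "foreign"),
  idbp   = c("mainland", "colony", "overseas", "foreign"),
  idnat2 = c("mainland", "overseas", "overseas", "foreign"),
  stringsAsFactors = FALSE
)

# Small made-up data set to recode
df <- data.frame(
  idnat = c("french", "foreign", "french"),
  idbp  = c("colony", "foreign", "mainland"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: combine both key columns into one string
key <- function(d) paste(d$idnat, d$idbp, sep = "|")

# match() finds the lookup row for each data row; unmatched rows give NA
df$idnat2 <- lookup$idnat2[match(key(df), key(lookup))]
df
```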

Source

2017-09-29 07:47:25 Uwe
