Split a variable and insert NAs in between

I have a variable that looks like this:

Var 
[1] 3, 4, 5  2, 4, 5  2, 4  1, 4, 5

I need to split it into a data frame that looks like this:

V1 V2 V3 V4 V5 
NA NA 3 4 5 
NA 2 NA 4 5 
NA 2 NA 4 NA 
1 NA NA 4 5

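Sorry, I couldn't find a post that solves my problem. Does anyone know how I could do this? Thank you very much in advance!

Edit: I found a solution based on your answers and posted it below.

Edit 2: I made my code more efficient using Ananda's solution.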

Source

2015-05-26 JSP

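Is 'Var' a 'list', a 'vector', or something else? Your example is not reproducible. Is it 'c(3,4,5,2,4,5,2,4,1,4,5)', 'list(c(3,4,5), c(2,4,5), c(2,4), c(1,4,5))', or 'c("3,4,5,2,4,5,2,4 1,4,5")'? – thelatemail

Judging from the OP's answer, "var" is a character vector, such as: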

var <- c("3, 4, 5", "2, 4, 5", "2, 4", "1, 4, 5")

If that is the case, you can consider cSplit_e from my "splitstackshape" package:

library(splitstackshape) 
cSplit_e(data.frame(var), "var", ",", mode = "value", drop = TRUE) 
# var_1 var_2 var_3 var_4 var_5 
# 1 NA NA  3  4  5 
# 2 NA  2 NA  4  5 
# 3 NA  2 NA  4 NA 
# 4  1 NA NA  4  5

If it is a list, as the other answers assume, you can use the (unexported) numMat function from "splitstackshape", which is what cSplit_e relies on.

var <- list(c(3,4,5), c(2,4,5), c(2,4), c(1,4,5)) 
splitstackshape:::numMat(var, mode = "value") 
#  1 2 3 4 5 
# [1,] NA NA 3 4 5 
# [2,] NA 2 NA 4 5 
# [3,] NA 2 NA 4 NA 
# [4,] 1 NA NA 4 5

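Under the hood, numMat takes an approach very similar to the one in @thelatemail's answer.

If you have -99 standing in for NA and want to exclude those values, you can try: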

var <- c("3, 4, 5", "2, -99, 4, 5", "2, 4", "1, 4, 5, -99") 
splitstackshape:::numMat(
    lapply(strsplit(var, ","), function(x) as.numeric(x)[as.numeric(x) > 0]), 
    mode = "value") 
#  1 2 3 4 5 
# [1,] NA NA 3 4 5 
# [2,] NA 2 NA 4 5 
# [3,] NA 2 NA 4 NA 
# [4,] 1 NA NA 4 5

Source

2015-05-29 02:27:51 A5C1D2H2I1M1N2O1R2T1

Thank you very much! Your first solution works great and makes my code shorter! – JSP

If we assume your var is a list, this seems to work:

var <- list(c(3,4,5),c(2,4,5),c(2,4),c(1,4,5)) 

#define function find_num to essentially create 
#5 new functions (called closures) inside the for-loop below 
find_num <- function(x) { 
    num <- function(mylist) { 
    sapply(mylist, function(i) if(x %in% i) return(x) else return(NA)) 
    } 
} 

#initiate list 
new_list <- list() 
#find_num is initiated with 5 different values essentially (in each iteration) 
#creating 5 new functions (closures) each for the number we want 
for (i in 1:5){ 
    myfunc <- find_num(i) 
    #this creates the list we want. Each element is a column 
    new_list[[length(new_list)+1]] <- myfunc(var) 
} 

#combine the columns into a new matrix 
new_list <- do.call(cbind, new_list)

Output:

> new_list 
    [,1] [,2] [,3] [,4] [,5] 
[1,] NA NA 3 4 5 
[2,] NA 2 NA 4 5 
[3,] NA 2 NA 4 NA 
[4,] 1 NA NA 4 5
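
The same per-number membership test can probably be written without the explicit loop over closures. The following is only a sketch of that idea, not part of the original answer; the names targets and compact are made up for illustration, and it assumes the values of interest are the integers 1 to 5.

```r
var <- list(c(3, 4, 5), c(2, 4, 5), c(2, 4), c(1, 4, 5))

# Hypothetical name: the candidate values, one per output column
targets <- 1:5

# For each list element, keep a target where it is present and NA otherwise.
# sapply() returns one column per list element, so transpose to get one row each.
compact <- t(sapply(var, function(v) ifelse(targets %in% v, targets, NA)))
compact
```

This should give the same 4 x 5 layout as the loop above, at the cost of hard-coding the range of values in targets.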

Source

2015-05-26 23:02:13 LyzandeR

Using matrix indexing:

Var <- list(c(3,4,5),c(2,4,5),c(2,4),c(1,4,5)) 
unVar <- unlist(Var) 
out <- matrix(NA, nrow=length(Var), ncol=max(unVar)) 

out[cbind(rep(seq_along(Var),sapply(Var,length)),unVar)] <- unVar 
# and if you're using the new version of R, you can simplify a little: 
out[cbind(rep(seq_along(Var),lengths(Var)),unVar)] <- unVar 

#  [,1] [,2] [,3] [,4] [,5] 
#[1,] NA NA 3 4 5 
#[2,] NA 2 NA 4 5 
#[3,] NA 2 NA 4 NA 
#[4,] 1 NA NA 4 5
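
Since the question asks for a data frame rather than a matrix, one possible last step is a plain conversion. This is a small sketch rather than part of the original answer; it relies on as.data.frame() naming the columns of an unnamed matrix V1, V2, and so on, and the name df is chosen only for illustration.

```r
Var <- list(c(3, 4, 5), c(2, 4, 5), c(2, 4), c(1, 4, 5))
unVar <- unlist(Var)

# Same matrix-indexing step as above (lengths() needs R 3.2.0 or later)
out <- matrix(NA, nrow = length(Var), ncol = max(unVar))
out[cbind(rep(seq_along(Var), lengths(Var)), unVar)] <- unVar

# Convert to a data frame; an unnamed matrix gets columns V1..V5
df <- as.data.frame(out)
names(df)
```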

Source

2015-05-26 23:11:28 thelatemail

If Var is just a vector, then I would do the following:

Var = c(3,4,5,2,4,5,2,4,1,4,5) 
RowIdx = c(rep(1,3),rep(2,3),rep(3,2),rep(4,3)) 
DF = matrix(NA,nrow=4,ncol=5) 

for (idx in 1:length(Var)){ 
    DF[RowIdx[idx],Var[idx]] = Var[idx] 
}

Of course, if you have more data, you will probably want to find a more automated way to generate the row indices.
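
One way the row indices might be generated automatically is sketched below. It assumes the number of values belonging to each row is known separately; the vector lens is a hypothetical input that is not part of the original answer, because a flat vector alone does not say where one row ends and the next begins.

```r
Var <- c(3, 4, 5, 2, 4, 5, 2, 4, 1, 4, 5)

# Hypothetical input: how many values belong to each row
lens <- c(3, 3, 2, 3)

# Repeat each row number as many times as that row has values
RowIdx <- rep(seq_along(lens), times = lens)

# Fill the matrix in one step with (row, column) index pairs instead of a loop
DF <- matrix(NA, nrow = length(lens), ncol = max(Var))
DF[cbind(RowIdx, Var)] <- Var
DF
```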

Source

2015-05-26 23:17:43 Celeste

Var <- list(c(3, 4, 5), c(2, 4, 5), c(2, 4), c(1, 4, 5)) 
M <- matrix(NA, nrow=length(Var), ncol=max(sapply(Var,max))) 
for(L in seq(Var)) { M [ cbind(rep(L, length(Var[[L]])), Var[[L]]) ] <- Var[[L]]} 
M 
    [,1] [,2] [,3] [,4] [,5] 
[1,] NA NA 3 4 5 
[2,] NA 2 NA 4 5 
[3,] NA 2 NA 4 NA 
[4,] 1 NA NA 4 5

Personally, my vote would go to thelatemail's version, which is basically isomorphic to this one.

Source

2015-05-27 00:02:04

I managed to find a solution based on your answers! My final solution looks like this:

# I had the additional problem that my variable was a factor, therefore I had to transform it first. 
df <- data.frame(Var) 
Var <- lapply(strsplit(as.character(df$Var), ", "), "[") 
for(i in 1:length(Var)){ 
    Var[[i]] <- as.numeric(Var[[i]]) 
} 

# Then I created a matrix based on thelatemail's and BondedDust's approach. 
M <- matrix(NA, nrow=length(Var), ncol=max(sapply(Var,max))) 

# Additionally, I had the problem that there were some lines with a single -99, which indicates a missing value for the complete line. I had some problems with this negative value. For this reason, I assigned NA's first. 
for(i in 1:length(Var)){ 
    Var[[i]][Var[[i]] == -99] <- NA 
} 

# Final assignment as suggested by BondedDust. 
for(L in seq(Var)) { M [ cbind(rep(L, length(Var[[L]])), Var[[L]]) ] <- Var[[L]]} 
M

I'm not sure whether this is the fastest solution, but everything works now! Thank you very much for your quick and thorough answers!

Source
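
A possible variant of the steps above drops the -99 markers before building the index instead of turning them into NA, so that no NA ever ends up in the index matrix. This is only a sketch under assumptions: the sample factor below is invented for illustration, -99 is taken to be the only missing-value code, and the names vals and idx are hypothetical.

```r
# Hypothetical sample data: a factor in which one line is just the missing code
Var <- factor(c("3, 4, 5", "2, 4, 5", "-99", "1, 4, 5"))

# Split each string on commas (with optional spaces) and convert to numbers
vals <- lapply(strsplit(as.character(Var), ",\\s*"), as.numeric)

# Drop the missing-value code; a line holding only -99 becomes an empty vector
vals <- lapply(vals, function(x) x[x != -99])

# Build (row, column) pairs; empty lines contribute no pairs and stay all NA
M <- matrix(NA_real_, nrow = length(vals), ncol = max(unlist(vals)))
idx <- cbind(rep(seq_along(vals), lengths(vals)), unlist(vals))
M[idx] <- unlist(vals)
M
```

With this sample input the third row stays entirely NA, and the other rows are filled as in the earlier answers.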

2015-05-27 08:55:21 JSP
