2017-03-07

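Collapse every two columns into a final set of two columns

I have the example_df below, which has 4 sets of columns, each set made up of two columns. I basically want a quick way to take each set of two columns and move its data into two result columns (shown in result_df, which is what I want to end up with). Any ideas on how to automate this?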

set.seed(20) 
example_df <- data.frame("test1" = c(rnorm(6), rep(NA, 18)),
                         "test2" = c(rnorm(6), rep(NA, 18)),
                         "test3" = c(rep(NA, 6), rnorm(6), rep(NA, 12)),
                         "test4" = c(rep(NA, 6), rnorm(6), rep(NA, 12)),
                         "test5" = c(rep(NA, 12), rnorm(6), rep(NA, 6)),
                         "test6" = c(rep(NA, 12), rnorm(6), rep(NA, 6)),
                         "test7" = c(rep(NA, 18), rnorm(6)),
                         "test8" = c(rep(NA, 18), rnorm(6)))

result_df <- data.frame("total1" = c(example_df[1:6, 1], example_df[7:12, 3], example_df[13:18, 5], example_df[19:24, 7]),
                        "total2" = c(example_df[1:6, 2], example_df[7:12, 4], example_df[13:18, 6], example_df[19:24, 8]))

You are right, I was doing too much at once. Thank you for the comment and the solution!

Answers


Here are two options for creating the expected output.

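1) We subset the alternating columns of 'example_df' with a logical index, unlist each subset, remove the NAs, and combine the two resulting vectors into a 2-column data.frame: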

total1 <- na.omit(unlist(example_df[c(TRUE, FALSE)])) 
total2 <- na.omit(unlist(example_df[c(FALSE, TRUE)])) 
d1 <- data.frame(total1, total2) 
row.names(d1) <- NULL 

#checking with the OP's output 
all.equal(d1, result_df, check.attributes=FALSE) 
#[1] TRUE 

Or in a single step (this returns a matrix rather than a data.frame):

na.omit(do.call(rbind, Map(cbind, example_df[c(TRUE, FALSE)], example_df[c(FALSE, TRUE)]))) 
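Option 1 relies on R recycling a short logical index across all columns. Here is a minimal sketch of that behaviour on a made-up data frame (`toy` and its column names are illustrative, not from the question):

```r
# Toy data frame (hypothetical, not from the question) with four columns.
toy <- data.frame(a = 1:2, b = 3:4, c = 5:6, d = 7:8)

# A logical index shorter than the number of columns is recycled,
# so c(TRUE, FALSE) behaves like c(TRUE, FALSE, TRUE, FALSE).
odd_part  <- toy[c(TRUE, FALSE)]   # keeps columns a and c
even_part <- toy[c(FALSE, TRUE)]   # keeps columns b and d

names(odd_part)
names(even_part)

# unlist() concatenates the selected columns top to bottom into one vector.
odd_values <- unname(unlist(odd_part))
odd_values
```

Because the index is recycled, the same two expressions should keep working if more column pairs are added, as long as the columns stay in pair order.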

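2) Loop over the sequence of odd column positions, subset 'example_df' into a list of 2-column pieces, bind the list elements together with rbindlist, and remove the rows containing NAs: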

library(data.table) 
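# use.names = FALSE binds by position, since the column names differ between pairs;
# the result keeps the names of the first pair (test1, test2)
rbindlist(lapply(seq(1, ncol(example_df), by = 2), function(i)
  example_df[i:(i + 1)]), use.names = FALSE)[complete.cases(test1, test2)]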
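If data.table is not available, the same loop can be sketched in base R. This is only an illustration on made-up data (`toy`, `starts`, `pieces` and `stacked` are hypothetical names); each piece is renamed so that rbind() can stack them:

```r
# Small stand-in for example_df (hypothetical values) with two column pairs.
toy <- data.frame(x1 = c(1, 2, NA, NA), y1 = c(10, 20, NA, NA),
                  x2 = c(NA, NA, 3, 4), y2 = c(NA, NA, 30, 40))

# Take columns i and i + 1 for every odd i, giving each piece the same
# column names so that rbind() can stack them.
starts <- seq(1, ncol(toy), by = 2)
pieces <- lapply(starts, function(i) {
  setNames(toy[i:(i + 1)], c("total1", "total2"))
})

stacked <- do.call(rbind, pieces)

# Keep only rows where both columns are present, then reset the row names.
stacked <- stacked[complete.cases(stacked), ]
row.names(stacked) <- NULL
stacked
```

The rename step is what replaces rbindlist's positional binding here, because rbind() on data frames matches columns by name.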
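
# Alternative: flag the odd-numbered columns, then drop the NAs column by column.
# This assumes every column holds the same number of non-NA values, so that
# apply() returns a matrix that as.vector() can flatten.
odd_cols <- as.logical(seq_len(ncol(example_df)) %% 2)

result_df <- data.frame(total1 = as.vector(apply(example_df[, odd_cols], 2, na.omit)),
                        total2 = as.vector(apply(example_df[, !odd_cols], 2, na.omit)))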
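The apply() approach above works when every column has the same number of non-NA values, as in the question. A small sketch on made-up data, with an explicit guard for that assumption (`toy`, `non_na_counts`, `kept` and `total1` here are illustrative names):

```r
# Toy data (hypothetical): each column has exactly two non-NA values.
toy <- data.frame(x1 = c(1, 2, NA, NA), y1 = c(10, 20, NA, NA),
                  x2 = c(NA, NA, 3, 4), y2 = c(NA, NA, 30, 40))

# seq_len(ncol(toy)) %% 2 is 1 for odd positions and 0 for even ones.
odd_cols <- as.logical(seq_len(ncol(toy)) %% 2)

# Guard: stop early if the columns do not all have the same non-NA count,
# because apply() would then return a list instead of a matrix.
non_na_counts <- colSums(!is.na(toy))
stopifnot(length(unique(non_na_counts)) == 1)

kept <- apply(toy[, odd_cols], 2, na.omit)  # one column per odd-numbered column
total1 <- as.vector(kept)                   # flattened column by column
total1
```

If the guard fails on real data, the unlist() based option from the first answer is probably the safer choice, since it does not depend on equal counts.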