How do I merge multiple variables to create a new factor variable in R?

I have data from a survey. It comes from a question that looks like this:

Did you do any of the following activities during your PhD?

                                        Yes, paid by my school.   Yes, paid by me.   No.
Attended an international conference?
Bought textbooks?

The data are saved automatically in a spreadsheet like this:

id  conf.1  conf.2  conf.3  text.1  text.2  text.3
1   1                               1
2           1               1
3                   1       1
4                   1                       1
5

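This means that participant 1 attended a conference paid for by her university, participant 2 attended a conference that he paid for himself, and participant 3 did not attend one.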

I want to merge conf.1, conf.2 and conf.3 into a single variable, and likewise text.1, text.2 and text.3:

id new.conf new.text 

1 1  2 
2 2  1 
3 3  1 
4 3  3 

where each number now represents the answer category of the survey question.

Thanks for your help

Source

2012-07-22 Bartolome Salom

This is a reshape, not a merge. Try 'reshape' (base R), 'reshapeasy' (taRifx package), or the 'reshape2' package. – 2012-07-22 21:57:28

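You did not say whether each set of questions can have more than one answer. If it can, this approach may not work for you, and I would recommend making the question more reproducible before going further. With that caveat out of the way, give this a whirl: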

library(reshape2) 
#recreate your data 
dat <- data.frame(id = 1:5, 
        conf.1 = c(1,rep(NA,4)), 
        conf.2 = c(NA,1, rep(NA,3)), 
        conf.3 = c(NA,NA,1,1, NA), 
        text.1 = c(NA,1,1,NA,NA), 
        text.2 = c(1, rep(NA,4)), 
        text.3 = c(rep(NA,3),1, NA)) 

#melt into long format 
dat.m <- melt(dat, id.vars = "id") 
#Split on the "." 
dat.m[, c("variable", "val")] <- with(dat.m, colsplit(variable, "\\.", c("variable", "val"))) 
#Subset out only the complete cases 
dat.m <- dat.m[complete.cases(dat.m),] 
#Cast back into wide format 
dcast(id ~ variable, value.var = "val", data = dat.m) 
#----- 
    id conf text 
1 1 1 2 
2 2 2 1 
3 3 3 1 
4 4 3 3
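
Note that participant 5, who ticked nothing for either question, disappears from the output above because of the complete.cases step. A minimal sketch of one way to bring such rows back with base R's merge, assuming the wide result has been stored in a data frame. The name `res` is hypothetical (it is not in the answer), and the values are typed in by hand so the block runs without reshape2:

```r
# 'res' is a hypothetical name for the dcast() result shown above,
# recreated by hand so this block does not depend on reshape2
res <- data.frame(id = 1:4, conf = c(1, 2, 3, 3), text = c(2, 1, 1, 3))

# every id from the original data, including the respondent with no answers
ids <- data.frame(id = 1:5)

# left join: ids without a match in 'res' get NA for conf and text
full <- merge(ids, res, by = "id", all.x = TRUE)
full
```

Whether an empty respondent should appear as NA or be left out depends on the analysis; this only shows how to keep the row.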

Source

2012-07-22 22:23:41 Chase

Thank you all for your answers. – 2012-07-23 20:30:26

Here is a base R approach that also handles the missing values:

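# column groups belonging to each survey question
confvars <- c("conf.1","conf.2","conf.3") 
textvars <- c("text.1","text.2","text.3") 

# for each row, return the position of the ticked column (NA if none is ticked)
which.sub <- function(x) { 
  # which.max() returns integer(0) for an all-NA row,
  # so apply() returns a list when such rows exist
  maxsub <- apply(dat[x], 1, which.max) 
  # replace the empty entries with NA before flattening
  maxsub[(lapply(maxsub, length) == 0)] <- NA 
  return(unlist(maxsub)) 
} 

data.frame(
  id = dat$id, 
  conf = which.sub(confvars), 
  text = which.sub(textvars) 
)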

Result:

id conf text 
1 1 1 2 
2 2 2 1 
3 3 3 1 
4 4 3 3 
5 5 NA NA

Source

2012-07-22 22:50:46 thelatemail

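Thanks. I have one more question: is it possible to convert the reshaped table into a LaTeX table that shows the name of each level (e.g. 1 = sponsored by my institution; 2 = sponsored by a different institution; 3 = no)? – 2012-07-23 21:21:16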
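
For the level names asked about in this comment, a minimal base R sketch is to turn the combined codes into factors with factor(). The labels below are taken from the survey's own answer options, and `res` and `answer_labels` are hypothetical names standing in for the result of either answer above. Exporting to LaTeX would need an additional package and is not shown here.

```r
# hypothetical stand-in for the combined result (codes 1 to 3, NA = no answer)
res <- data.frame(id = 1:5,
                  conf = c(1, 2, 3, 3, NA),
                  text = c(2, 1, 1, 3, NA))

# labels copied from the answer options of the survey question
answer_labels <- c("Yes, paid by my school", "Yes, paid by me", "No")

# attach the labels; codes outside 1:3 (and NA) stay NA
res$conf <- factor(res$conf, levels = 1:3, labels = answer_labels)
res$text <- factor(res$text, levels = 1:3, labels = answer_labels)

# frequency table per level, counting missing answers separately
table(res$conf, useNA = "ifany")
```

Any table built from these columns then shows the level names instead of the numeric codes.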

The following solution is very simple, and I use it a lot. Let's use the same data frame that Chase used above.

dat <- data.frame(id = 1:5, 
        conf.1 = c(1,rep(NA,4)), 
        conf.2 = c(NA,1, rep(NA,3)), 
        conf.3 = c(NA,NA,1,1, NA), 
        text.1 = c(NA,1,1,NA,NA), 
        text.2 = c(1, rep(NA,4)), 
        text.3 = c(rep(NA,3),1, NA))

We start by replacing the NAs with zeros.

dat[is.na(dat)] <- 0

Multiplying each column by a different number lets us compute the new variables with a simple sum.

dat <- transform(dat, conf=conf.1 + 2*conf.2 + 3*conf.3, 
         text=text.1 + 2*text.2 + 3*text.3)

Finally, we recode the zeros in the new variables (or, as here, in the whole data set) back to NA, and we are done.

dat[dat == 0] <- NA 

> dat 
    id conf.1 conf.2 conf.3 text.1 text.2 text.3 conf text 
1 1  1  NA  NA  NA  1  NA 1 2 
2 2  NA  1  NA  1  NA  NA 2 1 
3 3  NA  NA  1  1  NA  NA 3 1 
4 4  NA  NA  1  NA  NA  1 3 3 
5 5  NA  NA  NA  NA  NA  NA NA NA
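
The weighted sum relies on each participant ticking at most one option per question: if both conf.1 and conf.2 were 1, the sum would be 3 and would be read as "No". A minimal sketch of a check that could be run before recoding, using a made-up second row (id 6, not part of the original data) to trigger it; `chk` and `n_ticked` are hypothetical names:

```r
confvars <- c("conf.1", "conf.2", "conf.3")

# small illustrative data set; id 6 is invented and ticks two options
chk <- data.frame(id = c(1, 6),
                  conf.1 = c(1, 1),
                  conf.2 = c(NA, 1),
                  conf.3 = c(NA, NA))

# number of ticked options per participant for this question
n_ticked <- rowSums(!is.na(chk[confvars]))

# ids with more than one ticked option; here this flags id 6
chk$id[n_ticked > 1]
```

If this returns any ids, those rows need to be resolved by hand before the sum can be trusted.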

Source

2013-12-04 21:02:49
