How do I reshape a data frame in R or Excel?

The following code generates a sample data set:

set.seed(0) 
practice <- matrix(sample(1:100, 20), ncol = 2) 
data <- as.data.frame(practice) 
data <- cbind(lob = sprintf("objective%d", rep(1:2,each=5)), data) 
data <- cbind(student = sprintf("student%d", rep(1:5,2)), data) 
names(data) <- c("student", "learning objective","attempt", "score") 
data[-8,]

The data looks like this:

    student learning objective attempt score
1  student1         objective1      90     6
2  student2         objective1      27    19
3  student3         objective1      37    16
4  student4         objective1      56    60
5  student5         objective1      88    34
6  student1         objective2      20    66
7  student2         objective2      85    42
9  student4         objective2      61    82
10 student5         objective2      58    31

What I want is:

student  objective1   objective2 
       attempt score  attempt score 
1 student1   90  6   20  66 
2 student2   27 19   85  42 
3 student3   ...    0  0 
4 student4   ...     ... 
5 student5   ...     ...

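There are 70 learning objectives, so copying and pasting the attempts and scores by hand would be tedious. I am wondering whether there is a better way to clean the data.

R: I tried to use the melt function in R to get the new layout, but it did not work properly. Some students are missing scores, and their names are not listed for that objective (student3 in this example), so I cannot simply cbind the scores.

Excel: There are 70 learning objectives, and because of the missing names I would have to use VLOOKUP against the corresponding rows for every one of the 70 objectives: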

=VLOOKUP($C7,'0learning.csv'!$B$372:$G$395,5,0)
=VLOOKUP($C7,'0learning.csv'!$B$372:$G$395,6,0)

Is there a better way?


2015-06-16 SongTianyang

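We can use the development version of data.table, i.e. v1.9.5, whose dcast accepts multiple value.var columns and reshapes the data from 'long' to 'wide' format. Installation instructions are here.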

library(data.table) # v1.9.5+
names(data)[2] <- 'objective'
dcast(setDT(data), student~objective, value.var=c('attempt', 'score'))
#     student attempt_objective1 attempt_objective2 score_objective1 score_objective2
#1: student1                 90                 20                6               66
#2: student2                 27                 85               19               42
#3: student3                 37                 96               16               87
#4: student4                 56                 61               60               82
#5: student5                 88                 58               34               31
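The output above comes from the full sample data, where every student has a row for every objective. As a minimal sketch of the case described in the question, where student3 has no row for objective2, the example below asks dcast to put 0 into the missing cells through its fill argument. It assumes a data.table version that supports multiple value.var columns (v1.9.5 or later, as noted above). The small `scores` table is a hypothetical subset typed in from the question's printed values, not the author's full data.

```r
library(data.table)

# Hypothetical subset of the question's data: student3 has no objective2 row
scores <- data.table(
  student   = c("student1", "student2", "student3", "student1", "student2"),
  objective = c("objective1", "objective1", "objective1", "objective2", "objective2"),
  attempt   = c(90L, 27L, 37L, 20L, 85L),
  score     = c(6L, 19L, 16L, 66L, 42L)
)

# Cast both value columns to wide format; missing student/objective
# combinations are filled with 0 instead of NA
wide <- dcast(scores, student ~ objective,
              value.var = c("attempt", "score"), fill = 0L)
print(wide)
```

If NA is preferable to 0 for students who never attempted an objective, dropping the fill argument leaves those cells as NA.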

Or use reshape from base R:

reshape(data, idvar='student', timevar='objective', direction='wide')
#    student attempt.objective1 score.objective1 attempt.objective2 score.objective2
# 1 student1                 90                6                 20               66
# 2 student2                 27               19                 85               42
# 3 student3                 37               16                 96               87
# 4 student4                 56               60                 61               82
# 5 student5                 88               34                 58               31
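When a student has no row for some objective, reshape leaves the corresponding wide cells as NA. A minimal base R sketch, assuming the zeros shown in the question's desired output are what is wanted, replaces those NAs after reshaping. The `scores` data frame is a hypothetical subset typed in from the question's printed values, with student3 missing objective2.

```r
# Hypothetical subset of the question's data: student3 has no objective2 row
scores <- data.frame(
  student   = c("student1", "student2", "student3", "student1", "student2"),
  objective = c("objective1", "objective1", "objective1", "objective2", "objective2"),
  attempt   = c(90, 27, 37, 20, 85),
  score     = c(6, 19, 16, 66, 42),
  stringsAsFactors = FALSE
)

# Long to wide: one row per student, one attempt/score pair per objective
wide <- reshape(scores, idvar = "student", timevar = "objective",
                direction = "wide")

# Missing student/objective combinations come out as NA; set them to 0
wide[is.na(wide)] <- 0
print(wide)
```

This needs no packages beyond base R, so it also works when the development version of data.table is not installed.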


2015-06-16 16:53:49 akrun

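Thanks, but both lines of the code seem to produce errors. 1: > names(data)[2] <- 'objective' Warning message: ... 2: > dcast(setDT(data), student~objective, value.var = c('attempt', 'score')) .subset2(x, i, exact = exact): subscript out of bounds – SongTianyang

@SongTianyang Are you using the development version of 'data.table'? – akrun

@SongTianyang I added a 'base R' version, in case you don't have the devel version of data.table – akrun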
