2016-11-03

R reshape2 dcast: transforming data

How can I transform data X into Y in R?

X = data.frame(
    ID = c(1,1,1,2,2), 
    NAME = c("MIKE","MIKE","MIKE","LUCY","LUCY"), 
    SEX = c("MALE","MALE","MALE","FEMALE","FEMALE"), 
    TEST = c(1,2,3,1,2), 
    SCORE = c(70,80,90,65,75) 
) 

Y = data.frame(
    ID = c(1,2), 
    NAME = c("MIKE","LUCY"), 
    SEX = c("MALE","FEMALE"), 
    TEST_1 =c(70,65), 
    TEST_2 =c(80,75), 
    TEST_3 =c(90,NA) 
) 

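The dcast() function from reshape2 seems to work, but I cannot get it to keep the other columns, such as ID, NAME, and SEX in the example above.

Assuming all other columns are consistent within an ID (for example, Mike can only be the male with ID 1), how can this be done?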
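The consistency assumption above can be checked before reshaping. A minimal sketch in base R, using the X from the question; the names id_combos and is_consistent are only illustrative and not part of the original post:

```r
# Example data from the question
X <- data.frame(
  ID = c(1, 1, 1, 2, 2),
  NAME = c("MIKE", "MIKE", "MIKE", "LUCY", "LUCY"),
  SEX = c("MALE", "MALE", "MALE", "FEMALE", "FEMALE"),
  TEST = c(1, 2, 3, 1, 2),
  SCORE = c(70, 80, 90, 65, 75)
)

# Distinct combinations of the identifying columns
id_combos <- unique(X[, c("ID", "NAME", "SEX")])

# If an ID appeared with two different NAME/SEX values,
# it would show up more than once in id_combos
is_consistent <- anyDuplicated(id_combos$ID) == 0
is_consistent
```

If is_consistent is FALSE, the reshaped result would contain more than one row for some ID.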

What have you tried? This seems to work: dcast(X, ID + NAME + SEX ~ TEST, value.var = "SCORE") –

Or use library(tidyr); spread(X, TEST, SCORE) – akrun

Answer

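According to the documentation (?reshape2::dcast), dcast() allows ... in the formula:

"..." represents all other variables not used in the formula ...

Both the reshape2 and data.table packages support dcast().

So, you can write: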

reshape2::dcast(X, ... ~ TEST, value.var = "SCORE")
#   ID NAME    SEX  1  2  3
# 1  1 MIKE   MALE 70 80 90
# 2  2 LUCY FEMALE 65 75 NA
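To see what "..." expands to in this case, the following minimal sketch compares it with the explicit formula suggested in the comments. It assumes reshape2 is installed and uses the X from the question; wide_dots and wide_explicit are illustrative names only.

```r
library(reshape2)

# Example data from the question
X <- data.frame(
  ID = c(1, 1, 1, 2, 2),
  NAME = c("MIKE", "MIKE", "MIKE", "LUCY", "LUCY"),
  SEX = c("MALE", "MALE", "MALE", "FEMALE", "FEMALE"),
  TEST = c(1, 2, 3, 1, 2),
  SCORE = c(70, 80, 90, 65, 75)
)

# "..." stands for every column not otherwise used in the call
# (here: ID, NAME and SEX, since TEST and SCORE are already taken)
wide_dots <- dcast(X, ... ~ TEST, value.var = "SCORE")

# The same request with the identifying columns spelled out
wide_explicit <- dcast(X, ID + NAME + SEX ~ TEST, value.var = "SCORE")

# Both calls should produce the same wide data frame
all.equal(wide_dots, wide_explicit)
```

The "..." form is convenient when there are many identifying columns, while the explicit form makes it obvious which columns define a row.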

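However, if the OP insists that the column names should be TEST_1, TEST_2, and so on, the TEST column needs to be modified before reshaping. Here, data.table is used: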

library(data.table)
# setDT() converts X to a data.table by reference; := then rewrites TEST in place
dcast(setDT(X)[, TEST := paste0("TEST_", TEST)], ... ~ TEST, value.var = "SCORE")
#    ID NAME    SEX TEST_1 TEST_2 TEST_3
# 1:  1 MIKE   MALE     70     80     90
# 2:  2 LUCY FEMALE     65     75     NA

This matches the expected result given as data.frame Y.
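Note that the data.table call above changes X itself, because setDT() and := work by reference. If that is not wanted, one possible alternative is to reshape first and rename the new columns afterwards. A minimal sketch, assuming reshape2 is installed; wide, id_cols and score_cols are illustrative names:

```r
library(reshape2)

# Example data from the question
X <- data.frame(
  ID = c(1, 1, 1, 2, 2),
  NAME = c("MIKE", "MIKE", "MIKE", "LUCY", "LUCY"),
  SEX = c("MALE", "MALE", "MALE", "FEMALE", "FEMALE"),
  TEST = c(1, 2, 3, 1, 2),
  SCORE = c(70, 80, 90, 65, 75)
)

# Reshape to wide format; X is left unchanged
wide <- dcast(X, ... ~ TEST, value.var = "SCORE")

# Columns that identify a row, and the columns created from TEST
id_cols <- c("ID", "NAME", "SEX")
score_cols <- setdiff(names(wide), id_cols)

# Prefix only the columns that came from TEST
names(wide)[names(wide) %in% score_cols] <- paste0("TEST_", score_cols)
names(wide)
```

The trade-off is an extra renaming step in exchange for keeping the original TEST column numeric.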