2017-10-08

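How to reshape long ranking data into wide data format

The data snippet is taken from the mlogit package (Game2) and is shown in long format to mimic my situation. Here ch is the rank a respondent gave to a platform, and own is a per-respondent 0/1 indicator for that platform (in the mlogit data, whether the respondent owns it).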

   age hours    platform ch own chid
1   33  2.00     GameBoy  6   0    1
2   33  2.00    GameCube  5   0    1
3   33  2.00          PC  4   1    1
4   33  2.00 PlayStation  1   1    1
5   33  2.00  PSPortable  3   0    1
6   33  2.00        Xbox  2   0    1
7   19  3.25     GameBoy  6   0    2
8   19  3.25    GameCube  5   0    2
9   19  3.25          PC  1   1    2
10  19  3.25 PlayStation  2   1    2
11  19  3.25  PSPortable  3   0    2
12  19  3.25        Xbox  4   0    2
13  18  4.00     GameBoy  6   0    3
14  18  4.00    GameCube  4   0    3
15  18  4.00          PC  5   1    3
16  18  4.00 PlayStation  1   1    3
17  18  4.00  PSPortable  2   0    3
18  18  4.00        Xbox  3   0    3

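I need to convert this long data into wide form, one row per ID (chid), as shown below. This is the layout used in the mlogit package. The ranks are kept in the first six columns, from column 1 (ch.Xbox) to column 6 (ch.PC).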

  ch.Xbox ch.PlayStation ch.PSPortable ch.GameCube ch.GameBoy ch.PC own.Xbox own.PlayStation own.PSPortable own.GameCube own.GameBoy own.PC age hours
1       2              1             3           5          6     4        0               1              0            0           0      1  33  2.00
2       4              2             3           5          6     1        0               1              0            0           0      1  19  3.25
3       3              1             2           4          6     5        0               1              0            0           0      1  18  4.00

My question is how to convert the long format above into the wide format shown in this example.

Possible duplicate of [Can the value.var in dcast be a list or have multiple value variables?](https://stackoverflow.com/questions/23056328)
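The approach the linked question points to can be sketched with data.table. This is only an illustration under assumptions: it requires a data.table version whose dcast() accepts several value.var columns, and the names long_dt and wide_dt are introduced here and do not appear in the original post.

```r
library(data.table)

# Sample data from the question, rebuilt compactly so the block runs on its own.
long_dt <- data.table(
  age      = rep(c(33, 19, 18), each = 6),
  hours    = rep(c(2.00, 3.25, 4.00), each = 6),
  platform = rep(c("GameBoy", "GameCube", "PC",
                   "PlayStation", "PSPortable", "Xbox"), times = 3),
  ch       = c(6, 5, 4, 1, 3, 2,
               6, 5, 1, 2, 3, 4,
               6, 4, 5, 1, 2, 3),
  own      = rep(c(0, 0, 1, 1, 0, 0), times = 3),
  chid     = rep(1:3, each = 6)
)

# One row per respondent; each value column is spread across the platforms,
# giving names such as ch.Xbox and own.Xbox.
wide_dt <- dcast(long_dt, chid + age + hours ~ platform,
                 value.var = c("ch", "own"), sep = ".")
```

The columns come out grouped by value variable (all ch.* columns, then all own.* columns); their order within each group may still need rearranging to match the desired layout.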

Answer

We can use dplyr and tidyr to reshape the data.

library(dplyr) 
library(tidyr) 

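# Reshape the data
dt2 <- dt %>%
    gather(type, value, ch, own) %>%                       # stack ch and own into type/value pairs
    unite("platform_type", type, platform, sep = ".") %>%  # build names such as "ch.Xbox"
    spread(platform_type, value) %>%                       # one column per type.platform combination
    arrange(chid)                                          # one row per respondent, ordered by chid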
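Since tidyr 1.0.0, gather() and spread() are superseded by pivot_longer() and pivot_wider(), and the same reshape can be done in a single call. The following is a minimal sketch assuming tidyr >= 1.0.0; dt_demo and wide_demo are names introduced here for illustration only.

```r
library(tidyr)

# Sample data from the question, rebuilt compactly so the block runs on its own.
dt_demo <- data.frame(
  age      = rep(c(33, 19, 18), each = 6),
  hours    = rep(c(2.00, 3.25, 4.00), each = 6),
  platform = rep(c("GameBoy", "GameCube", "PC",
                   "PlayStation", "PSPortable", "Xbox"), times = 3),
  ch       = c(6, 5, 4, 1, 3, 2,
               6, 5, 1, 2, 3, 4,
               6, 4, 5, 1, 2, 3),
  own      = rep(c(0, 0, 1, 1, 0, 0), times = 3),
  chid     = rep(1:3, each = 6),
  stringsAsFactors = FALSE
)

# Spread both value columns across platforms in one step.
# names_sep = "." yields column names such as ch.Xbox and own.Xbox.
wide_demo <- pivot_wider(
  dt_demo,
  id_cols     = c(chid, age, hours),
  names_from  = platform,
  values_from = c(ch, own),
  names_sep   = "."
)
```

This skips the intermediate gather/unite step because pivot_wider() accepts several values_from columns and prefixes each new column with the value column's name.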

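If you want the final output to match the desired output exactly, you can prepare a vector of column names in the required order and select the columns based on it.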

# Prepare the column vector 
vec <- c("Xbox", "PlayStation", "PSPortable", "GameCube", "GameBoy", "PC") 
colname <- unlist(lapply(c("ch.", "own."), function(x) paste0(x, vec))) 
colname2 <- c(colname, "age", "hours") 

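# Select columns in the desired order; all_of() marks colname2 as an
# external character vector (requires tidyselect >= 1.0.0)
dt3 <- dt2 %>% select(all_of(colname2))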

# View the result
dt3
  ch.Xbox ch.PlayStation ch.PSPortable ch.GameCube ch.GameBoy ch.PC own.Xbox own.PlayStation own.PSPortable own.GameCube own.GameBoy own.PC age hours
1       2              1             3           5          6     4        0               1              0            0           0      1  33  2.00
2       4              2             3           5          6     1        0               1              0            0           0      1  19  3.25
3       3              1             2           4          6     5        0               1              0            0           0      1  18  4.00
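For comparison, the same wide layout can also be produced without extra packages using base R's reshape(). This is a hedged sketch rather than part of the original answer; long_df, wide_base and platforms are hypothetical names chosen for this example.

```r
# Sample data from the question, rebuilt compactly so the block runs on its own.
long_df <- data.frame(
  age      = rep(c(33, 19, 18), each = 6),
  hours    = rep(c(2.00, 3.25, 4.00), each = 6),
  platform = rep(c("GameBoy", "GameCube", "PC",
                   "PlayStation", "PSPortable", "Xbox"), times = 3),
  ch       = c(6, 5, 4, 1, 3, 2,
               6, 5, 1, 2, 3, 4,
               6, 4, 5, 1, 2, 3),
  own      = rep(c(0, 0, 1, 1, 0, 0), times = 3),
  chid     = rep(1:3, each = 6),
  stringsAsFactors = FALSE
)

# Columns not listed in v.names, idvar or timevar (age, hours) are treated
# as constant within each chid and carried over unchanged.
wide_base <- reshape(long_df,
                     idvar     = "chid",
                     timevar   = "platform",
                     v.names   = c("ch", "own"),
                     direction = "wide",
                     sep       = ".")

# Reorder the columns to match the desired layout and reset the row names.
platforms <- c("Xbox", "PlayStation", "PSPortable", "GameCube", "GameBoy", "PC")
wide_base <- wide_base[, c(paste0("ch.", platforms),
                           paste0("own.", platforms),
                           "age", "hours")]
rownames(wide_base) <- NULL
```

reshape() interleaves the ch.* and own.* columns per platform, so the explicit reordering step is what puts the six rank columns first.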

DATA

dt <- read.table(text = "   age hours    platform ch own chid
1   33  2.00     GameBoy  6   0    1
2   33  2.00    GameCube  5   0    1
3   33  2.00          PC  4   1    1
4   33  2.00 PlayStation  1   1    1
5   33  2.00  PSPortable  3   0    1
6   33  2.00        Xbox  2   0    1
7   19  3.25     GameBoy  6   0    2
8   19  3.25    GameCube  5   0    2
9   19  3.25          PC  1   1    2
10  19  3.25 PlayStation  2   1    2
11  19  3.25  PSPortable  3   0    2
12  19  3.25        Xbox  4   0    2
13  18  4.00     GameBoy  6   0    3
14  18  4.00    GameCube  4   0    3
15  18  4.00          PC  5   1    3
16  18  4.00 PlayStation  1   1    3
17  18  4.00  PSPortable  2   0    3
18  18  4.00        Xbox  3   0    3",
                 header = TRUE, stringsAsFactors = FALSE)