2013-08-22 84 views
1

How do I reshape data into long format?

I have a .csv file that looks like this:

+-------+---------+------+-------+ 
| CONN | TABLE | COLS | OWNER | 
+-------+---------+------+-------+ 
| ONE | TABLE_A | 10 | MIKE | 
| ONE | TABLE_B | 9 | MIKE | 
| ONE | TAB_A | 11 | KIM | 
| ONE | TAB_B | 14 | KIM | 
| TWO | TABLE_A | 9 | MIKE | 
| TWO | TABLE_B | 9 | MIKE | 
| TWO | TAB_A | 11 | KIM | 
| TWO | TAB_D | 56 | KIM | 
| THREE | TABLE_A | 9 | MIKE | 
| THREE | TABLE_C | 3 | MIKE | 
| THREE | TABLE_D | 11 | KIM | 
| THREE | TAB_A | 11 | KIM | 
+-------+---------+------+-------+ 

I want to compare the tables and their COLS by CONN and OWNER. How can I reshape this data to make that comparison? My data is here:

dat <- structure(list(CONN = c("ONE", "ONE", "ONE", "ONE", "TWO", "TWO", 
"TWO", "TWO", "THREE", "THREE", "THREE", "THREE"), TABLE = c("TABLE_A", 
"TABLE_B", "TAB_A", "TAB_B", "TABLE_A", "TABLE_B", "TAB_A", "TAB_D", 
"TABLE_A", "TABLE_C", "TABLE_D", "TAB_A"), COLS = c(10L, 9L, 
11L, 14L, 9L, 9L, 11L, 56L, 9L, 3L, 11L, 11L), OWNER = c("MIKE", 
"MIKE", "KIM", "KIM", "MIKE", "MIKE", "KIM", "KIM", "MIKE", "MIKE", 
"KIM", "KIM")), .Names = c("CONN", "TABLE", "COLS", "OWNER"), class = "data.frame", row.names = c(NA, 
-12L)) 

I tried this:

reshape(dat, varying=c('TABLE', 'COLS'), v.names=C('CONN', 'OWNER'), direction='long') 
Error in C("CONN", "OWNER") : object not interpretable as a factor 
+0

Can you show us what table you want? – statquant

+1

I usually find 'reshape2' more intuitive: do you want something like 'dcast(CONN + OWNER ~ TABLE, data = dat, value.var = "COLS")'? –

+0

Thanks @VincentZoonekynd, this did the trick: dcast(OWNER + CONN ~ TABLE, data = g, value.var = "COLS") – Matkrupp

Answers

1

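I usually find the reshape2 package more intuitive: just put the variables you want as rows before the ~ and the variables you want as columns after it.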

library(reshape2)

# One row per CONN/OWNER pair, one column per TABLE, filled with COLS
dcast(CONN + OWNER ~ TABLE, data = dat, value.var = "COLS")
#    CONN OWNER TAB_A TAB_B TAB_D TABLE_A TABLE_B TABLE_C TABLE_D
# 1   ONE   KIM    11    14    NA      NA      NA      NA      NA
# 2   ONE  MIKE    NA    NA    NA      10       9      NA      NA
# 3 THREE   KIM    11    NA    NA      NA      NA      NA      11
# 4 THREE  MIKE    NA    NA    NA       9      NA       3      NA
# 5   TWO   KIM    11    NA    56      NA      NA      NA      NA
# 6   TWO  MIKE    NA    NA    NA       9       9      NA      NA
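
If the aim is to compare how many columns the same table has on each connection, the formula can also be turned around so that CONN ends up on the column side. The following is only a sketch of that variation and not part of the original answer: it rebuilds dat from the question so that it runs on its own, and the names by_conn, conn_cols and differs are chosen here purely for illustration.

```r
library(reshape2)

# Rebuild the question's data so the example is self-contained
dat <- data.frame(
  CONN  = rep(c("ONE", "TWO", "THREE"), each = 4),
  TABLE = c("TABLE_A", "TABLE_B", "TAB_A", "TAB_B",
            "TABLE_A", "TABLE_B", "TAB_A", "TAB_D",
            "TABLE_A", "TABLE_C", "TABLE_D", "TAB_A"),
  COLS  = c(10L, 9L, 11L, 14L, 9L, 9L, 11L, 56L, 9L, 3L, 11L, 11L),
  OWNER = rep(c("MIKE", "MIKE", "KIM", "KIM"), times = 3),
  stringsAsFactors = FALSE
)

# One row per OWNER/TABLE pair, one column per connection
by_conn <- dcast(dat, OWNER + TABLE ~ CONN, value.var = "COLS")

# Flag rows where a table is missing on some connection
# or where its column count differs between connections
conn_cols <- by_conn[, c("ONE", "TWO", "THREE")]
differs <- apply(conn_cols, 1, function(x) anyNA(x) || length(unique(x)) > 1)

by_conn[differs, ]
```

Whether CONN or TABLE belongs on the column side depends on which comparison matters more; both layouts come from the same dcast() call with the formula rearranged.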
4

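All those CAPITAL column names have made your shift key a bit sticky: you wrote C('CONN','OWNER') with an uppercase C. With a lowercase c the call runs without the error: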

> reshape(dat, varying=c('TABLE', 'COLS'), v.names=c('CONN', 'OWNER'), direction='long') 
      CONN OWNER time id
1  TABLE_A    10    1  1
2  TABLE_B     9    1  2
3    TAB_A    11    1  3
4    TAB_B    14    1  4
5  TABLE_A     9    1  5
6  TABLE_B     9    1  6
7    TAB_A    11    1  7
8    TAB_D    56    1  8
9  TABLE_A     9    1  9
10 TABLE_C     3    1 10
11 TABLE_D    11    1 11
12   TAB_A    11    1 12
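
In the output above, the CONN and OWNER columns simply hold the old TABLE and COLS values, so the call runs but does not yet give a comparison layout. As a hedged sketch (this is my reading of the goal, not something the answer states), a base R reshape() call for a wide comparison table could set idvar, timevar and v.names as below. The data frame is rebuilt from the question so the block runs on its own, and wide is just an illustrative name.

```r
# Rebuild the question's data so the example is self-contained
dat <- data.frame(
  CONN  = rep(c("ONE", "TWO", "THREE"), each = 4),
  TABLE = c("TABLE_A", "TABLE_B", "TAB_A", "TAB_B",
            "TABLE_A", "TABLE_B", "TAB_A", "TAB_D",
            "TABLE_A", "TABLE_C", "TABLE_D", "TAB_A"),
  COLS  = c(10L, 9L, 11L, 14L, 9L, 9L, 11L, 56L, 9L, 3L, 11L, 11L),
  OWNER = rep(c("MIKE", "MIKE", "KIM", "KIM"), times = 3),
  stringsAsFactors = FALSE
)

# idvar:   columns that identify one output row (CONN and OWNER)
# timevar: column whose values become the new column suffixes (TABLE)
# v.names: column whose values fill the new columns (COLS)
wide <- reshape(dat,
                idvar     = c("CONN", "OWNER"),
                timevar   = "TABLE",
                v.names   = "COLS",
                direction = "wide")

wide
```

The new columns are named COLS.TABLE_A, COLS.TAB_A and so on, and combinations that do not occur in the data are filled with NA.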
+0

Thanks for pointing that out. – Matkrupp

0

You can also use the gather() function from the tidyr package.

Function:  gather(data, key, value, ..., na.rm = FALSE, convert = FALSE) 
Same as:  data %>% gather(key, value, ..., na.rm = FALSE, convert = FALSE) 

Arguments: 
    data:   data frame 
    key:   column name representing new variable 
    value:   column name representing variable values 
    ...:   names of columns to gather (or not gather) 
    na.rm:   option to remove observations with missing values (represented by NAs) 
    convert:  if TRUE will automatically convert values to logical, integer, numeric, complex or 
        factor as appropriate
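
To make the argument list above concrete, here is a small sketch of gather() turning a wide table back into long format. The wide data frame is a hand-made two-row excerpt in the shape of the dcast() result (values copied from the question's data), and the names wide and long are assumptions made for this illustration only.

```r
library(tidyr)

# A small wide table: one row per CONN/OWNER pair, one column per table
wide <- data.frame(
  CONN    = c("ONE", "ONE"),
  OWNER   = c("KIM", "MIKE"),
  TAB_A   = c(11L, NA),
  TAB_B   = c(14L, NA),
  TABLE_A = c(NA, 10L),
  TABLE_B = c(NA, 9L),
  stringsAsFactors = FALSE
)

# key   = name of the new column holding the old column names
# value = name of the new column holding the cell values
# -CONN, -OWNER = gather every column except these two
# na.rm = TRUE drops the combinations that were NA in the wide table
long <- gather(wide, key = "TABLE", value = "COLS", -CONN, -OWNER, na.rm = TRUE)

long
```

Note that the question's dat is already in long format, so gather() is mainly useful for going back from a wide table such as the dcast() result.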