2015-07-02 110 views
3

R - data frame manipulation

Suppose I have this data frame:

df <- data.frame(ID = c("id1", "id1", "id1", "id2", "id2", "id3", "id3", "id3"), 
    Code = c("A", "B", "C", "A", "B", "A", "C", "D"), 
    Count = c(34,65,21,3,8,12,15,16), Value = c(3,1,8,2,3,3,5,8)) 

which looks like this:

> df
   ID Code Count Value
1 id1    A    34     3
2 id1    B    65     1
3 id1    C    21     8
4 id2    A     3     2
5 id2    B     8     3
6 id3    A    12     3
7 id3    C    15     5
8 id3    D    16     8

I would like to obtain a result data frame like this:

result <- data.frame(Code = c("A", "B", "C", "D"), 
     id1_count = c(34,65,21,NA), id1_value = c(3,1,8,NA), 
     id2_count = c(3, 8, NA, NA), id2_value = c(2, 3, NA, NA), 
     id3_count = c(12,NA,15,16), id3_value = c(3,NA,5,8)) 

which looks like this:

> result
  Code id1_count id1_value id2_count id2_value id3_count id3_value
1    A        34         3         3         2        12         3
2    B        65         1         8         3        NA        NA
3    C        21         8        NA        NA        15         5
4    D        NA        NA        NA        NA        16         8

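Is there a one-liner in the base R packages that can do this? I was able to get the result I need, but not in an R-like way (I used loops, etc.). Any help is appreciated. Thanks.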

Answers

3

You could try dcast from the devel version of data.table (v1.9.5), which can take multiple value.var columns. Instructions to install it are here.

library(data.table) 
dcast(setDT(df), Code~ID, value.var=c('Count', 'Value')) 
#    Code Count_id1 Count_id2 Count_id3 Value_id1 Value_id2 Value_id3
#1:     A        34         3        12         3         2         3
#2:     B        65         8        NA         1         3        NA
#3:     C        21        NA        15         8        NA         5
#4:     D        NA        NA        16        NA        NA         8

Or use reshape from base R:

reshape(df, idvar='Code', timevar='ID', direction='wide') 
#  Code Count.id1 Value.id1 Count.id2 Value.id2 Count.id3 Value.id3
#1    A        34         3         3         2        12         3
#2    B        65         1         8         3        NA        NA
#3    C        21         8        NA        NA        15         5
#8    D        NA        NA        NA        NA        16         8
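
If the exact column names from the question (id1_count, id1_value, ...) matter, the reshape output can be renamed afterwards. The following is only a sketch: the object names wide and parts are made up for illustration, and it assumes the ID values themselves contain no "." character.

```r
df <- data.frame(ID = c("id1", "id1", "id1", "id2", "id2", "id3", "id3", "id3"),
                 Code = c("A", "B", "C", "A", "B", "A", "C", "D"),
                 Count = c(34, 65, 21, 3, 8, 12, 15, 16),
                 Value = c(3, 1, 8, 2, 3, 3, 5, 8))

# Long to wide: one row per Code, one Count/Value pair of columns per ID
wide <- reshape(df, idvar = "Code", timevar = "ID", direction = "wide")

# Turn names such as "Count.id1" into "id1_count":
# split on the literal ".", swap the two parts, lower-case the variable name
parts <- strsplit(names(wide)[-1], ".", fixed = TRUE)
names(wide)[-1] <- vapply(parts,
                          function(p) paste(p[2], tolower(p[1]), sep = "_"),
                          character(1))

# reshape keeps the original row names (1, 2, 3, 8); reset them to 1..4
rownames(wide) <- NULL
```

The renaming step is separate from reshape itself, so it can be dropped if the default Count.id1 style names are acceptable.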
+2

Yes. That does the trick. Thanks. I wasn't aware of this reshape function. I tried xtabs and merge, but that was a bit of a pain... – Marius

+1

akrun, maybe you should upgrade to the latest version of 1.9.5. The order in which the column names are combined has now changed... – Arun

+0

@Arun Thanks for the info. I upgraded to the latest version and edited the output. – akrun

1

You could also try:

library(tidyr) 
library(dplyr) 

df %>% 
    gather(key, value, -(ID:Code)) %>% 
    unite(id_key, ID, key) %>% 
    spread(id_key, value) 

which gives:

#  Code id1_Count id1_Value id2_Count id2_Value id3_Count id3_Value
#1    A        34         3         3         2        12         3
#2    B        65         1         8         3        NA        NA
#3    C        21         8        NA        NA        15         5
#4    D        NA        NA        NA        NA        16         8
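
In newer versions of tidyr (1.0.0 and later), gather() and spread() are superseded by pivot_longer() and pivot_wider(), so the same reshaping can probably be written in a single call. This is a sketch under that version assumption, not part of the original answer. The name wide is illustrative, and the column order may differ from the output above.

```r
library(tidyr)

df <- data.frame(ID = c("id1", "id1", "id1", "id2", "id2", "id3", "id3", "id3"),
                 Code = c("A", "B", "C", "A", "B", "A", "C", "D"),
                 Count = c(34, 65, 21, 3, 8, 12, 15, 16),
                 Value = c(3, 1, 8, 2, 3, 3, 5, 8))

# Spread both Count and Value across the ID values in one step;
# names_glue builds names such as "id1_Count" from the ID and the value column
wide <- pivot_wider(df,
                    names_from = ID,
                    values_from = c(Count, Value),
                    names_glue = "{ID}_{.value}")
```

The result is a tibble with one row per Code, with NA where an ID/Code combination is absent from df.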