Combining rows of a data frame

I have a data frame with two id variables and one name variable. These variables occur in varying numbers of combinations.

## dput'ed data.frame 
df <- structure(list(V1 = structure(c(1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 
4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L), .Label = c("A", 
"B", "C", "D", "E"), class = "factor"), V2 = c(1L, 2L, 3L, 1L, 
2L, 3L, 2L, 2L, 1L, 3L, 1L, 2L, 1L, 3L, 2L, 1L, 1L, 3L, 1L, 1L 
), V3 = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 2L, 2L, 1L, 3L, 1L, 
2L, 1L, 3L, 2L, 1L, 1L, 3L, 1L, 1L), .Label = c("test1", "test2", 
"test3"), class = "factor")), .Names = c("V1", "V2", "V3"), class = "data.frame", row.names = c(NA, 
-20L)) 
> df
   V1 V2    V3
1   A  1 test1
2   B  2 test2
3   C  3 test3
4   D  1 test1
5   E  2 test2
6   A  3 test3
7   B  2 test2
8   C  2 test2
9   D  1 test1
10  E  3 test3
11  A  1 test1
12  B  2 test2
13  C  1 test1
14  D  3 test3
15  E  2 test2
16  A  1 test1
17  B  1 test1
18  C  3 test3
19  D  1 test1
20  E  1 test1

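I would like a result with only one row per value of V1, where the values of the second and third variables are combined into comma-separated lists. Like this: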

f V2   V3 
1 A 1 ,3 ,1 ,1 test1 ,test3 ,test1 ,test1 
2 B 2 ,2 ,2 ,1 test2 ,test2 ,test2 ,test1 
3 C 3 ,2 ,1 ,3 test3 ,test2 ,test1 ,test3 
4 D 1 ,1 ,3 ,1 test1 ,test1 ,test3 ,test1 
5 E 2 ,3 ,2 ,1 test2 ,test3 ,test2 ,test1

I have tried this with the following code, which works but is a bit slow. Any suggestions for a faster solution?

df = lapply(levels(df$V1), function(f){ 
    cbind(f, 
     paste(df$V2[df$V1==f],collapse=" ,"), 
     paste(df$V3[df$V1==f],collapse=" ,")) 
}) 
df = as.data.frame(do.call(rbind, df)) 
df

Edit: corrected the dput(df)

Source

2012-07-10 Davy Kavanagh

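It looks like you `dput` your desired result rather than the data you want to transform. – 2012-07-10 15:37:40

Sorry. It should be fixed now. – 2012-07-10 15:52:54

Is speed the only thing you are after? By collapsing all of these values into single strings, your output also limits what you can do with the data afterwards. Using `aggregate` avoids this; each column in the output is a list, from which you can easily get back to the earlier data format. – A5C1D2H2I1M1N2O1R2T1 2012-07-10 16:37:05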

Make sure V3 (or any other factor variable) is converted to character with as.character, then use aggregate:

df$V3 = as.character(df$V3) 
aggregate(df[-1], by=list(df$V1), c, simplify=FALSE) 
# Group.1   V2       V3 
# 1  A 1, 3, 1, 1 test1, test3, test1, test1 
# 2  B 2, 2, 2, 1 test2, test2, test2, test1 
# 3  C 3, 2, 1, 3 test3, test2, test1, test3 
# 4  D 1, 1, 3, 1 test1, test1, test3, test1 
# 5  E 2, 3, 2, 1 test2, test3, test2, test1
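
To illustrate the point that list columns keep the data recoverable, here is a minimal sketch of going back to the long format. It uses a small four-row stand-in rather than the question's full data, and the names `agg` and `long` are only illustrative; it also assumes R 3.2.0 or later for `lengths()`.

```r
# Small stand-in for the question's data (an assumption, not the full 20 rows)
df <- data.frame(V1 = c("A", "B", "A", "B"),
                 V2 = c(1L, 2L, 3L, 2L),
                 V3 = c("test1", "test2", "test3", "test2"),
                 stringsAsFactors = FALSE)

# Same call as above, with the grouping column named V1 instead of Group.1
agg <- aggregate(df[-1], by = list(V1 = df$V1), c, simplify = FALSE)

# Each aggregated column is a list holding one vector per group
is.list(agg$V2)

# Undo the grouping: repeat each key once per stored value, then flatten
long <- data.frame(V1 = rep(agg$V1, lengths(agg$V2)),
                   V2 = unlist(agg$V2),
                   V3 = unlist(agg$V3),
                   stringsAsFactors = FALSE)
long
```

The rows of `long` come back ordered by group rather than in their original order, so this restores the content of the data, not the original row sequence.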

Source

2012-07-10 15:52:25 A5C1D2H2I1M1N2O1R2T1

do.call("rbind", lapply(split(df[, 2:3], df[,1]), function(x) sapply(x, paste, collapse=","))) 
    V2  V3      
A "1,3,1,1" "test1,test3,test1,test1" 
B "2,2,2,1" "test2,test2,test2,test1" 
C "3,2,1,3" "test3,test2,test1,test3" 
D "1,1,3,1" "test1,test1,test3,test1" 
E "2,3,2,1" "test2,test3,test2,test1"
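
The result above is a character matrix with the V1 values as row names, whereas the question asked for a data frame with V1 as a column. A possible conversion is sketched below on a small four-row stand-in for the data; the names `m` and `res` are only illustrative.

```r
# Small stand-in for the question's data (an assumption, not the full 20 rows)
df <- data.frame(V1 = c("A", "B", "A", "B"),
                 V2 = c(1L, 2L, 3L, 2L),
                 V3 = c("test1", "test2", "test3", "test2"),
                 stringsAsFactors = FALSE)

# Same approach as above: split by V1, collapse each column, stack the rows
m <- do.call("rbind", lapply(split(df[, 2:3], df[, 1]),
                             function(x) sapply(x, paste, collapse = ",")))

# Move the row names into a regular column and drop them as row names
res <- data.frame(V1 = rownames(m), m, row.names = NULL,
                  stringsAsFactors = FALSE)
res
```

Passing `row.names = NULL` explicitly keeps `data.frame()` from reusing the matrix row names, so `res` gets plain integer row names.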

Source

2012-07-10 15:37:30 johannes
