2015-09-08

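Row totals in a data frame based on two columns

I want to sum the values of one column, grouped by two other columns. I found out how to do this when grouping by one column, but can't figure out how to do it for two. For example, if I have the following data frame: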

x=c("a","a", "b", "b","c", "c","a","a","b","b","c","c", "a", "a","b","b", "c", "c") 
y=c(1:18) 
q=c("M","M","M", "M","M","M","W","W","W","W","W","W","F","F","F","F","F","F") 
df<-data.frame(x,y,q) 

I want to add up the values of y across the x and q columns, so that I end up with something like this:

x=c("a","b","c","a","b","c","a","b","c") 
y=c(3,7,11,15,19,23,27,31,35) 
q=c("M","M","M","W","W","W","F","F","F") 
d<-data.frame(x,y,q) 

'aggregate(y ~ x + q, df, sum)' – Jaap

Or with the 'dplyr' package: 'df %>% group_by(x, q) %>% summarise(ySum = sum(y))'. – eipi10

Thanks, both. I tried 'aggregate' and it worked. Will try the second one just for fun. – Vasile

Answer

New data frame

You have several options:

1: base R

aggregate(y~x+q, df, sum) 
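
The data.table and dplyr options include a variant for summarising several columns at once; base R's formula interface can do the same. The following is a minimal sketch, assuming a second numeric column `z` that is hypothetical and not part of the question's data:

```r
# Example data from the question, plus a hypothetical extra numeric column z
df2 <- data.frame(
  x = rep(rep(c("a", "b", "c"), each = 2), times = 3),
  y = 1:18,
  z = 18:1,  # assumption: added only to illustrate summing several columns
  q = rep(c("M", "W", "F"), each = 6)
)

# "." on the left-hand side means "every column not used for grouping"
res2 <- aggregate(. ~ x + q, data = df2, FUN = sum)

# Equivalent, naming the columns explicitly
res2_explicit <- aggregate(cbind(y, z) ~ x + q, data = df2, FUN = sum)

print(res2)
```

The explicit `cbind(y, z)` form is probably the safer choice when the data frame also contains columns that should not be summed.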

2: data.table

library(data.table) 
setDT(df)[, .(sumy=sum(y)), by = .(x,q)] 

# when you want to summarise several columns: 
setDT(df)[, lapply(.SD, sum), by = .(x,q)] 

3: dplyr

library(dplyr) 
df %>% group_by(x,q) %>% summarise(sumy = sum(y)) 

# when you want to summarise several columns: 
# summarise_each() and funs() are deprecated; with dplyr >= 1.0.0 use across():
df %>% group_by(x,q) %>% summarise(across(everything(), sum)) 

All of these should give you the same result (although not in the same order). For example, the data.table output looks like this:

   x q  y
1: a M  3
2: b M  7
3: c M 11
4: a W 15
5: b W 19
6: c W 23
7: a F 27
8: b F 31
9: c F 35
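
Since the row order differs between the approaches, it can help to sort a result explicitly before comparing it with another one or with the output wanted in the question. A minimal base-R sketch, assuming the desired order of q is M, W, F as in the question:

```r
# Rebuild the example data from the question
df <- data.frame(
  x = rep(rep(c("a", "b", "c"), each = 2), times = 3),
  y = 1:18,
  q = rep(c("M", "W", "F"), each = 6)
)

# Sum y within each (x, q) combination
res <- aggregate(y ~ x + q, data = df, FUN = sum)

# Re-order the rows: q in the order M, W, F, then x alphabetically
res$q <- factor(res$q, levels = c("M", "W", "F"))
res <- res[order(res$q, res$x), ]
rownames(res) <- NULL

print(res)
```

Setting the factor levels by hand is only one way to fix the order; sorting every result by the same keys would work just as well for a comparison.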