How to apply different aggregation functions to different columns in R?

How can I apply different aggregation functions to different columns in R? The aggregate() function accepts only a single function argument:

V1 V2  V3 
1 18.45022 62.24411694 
2 90.34637 20.86505214 
1 50.77358 27.30074987 
2 52.95872 30.26189013 
1 61.36935 26.90993530 
2 49.31730 70.60387016 
1 43.64142 87.64433517 
2 36.19730 83.47232907 
1 91.51753 0.03056485 
... ...  ... 

> aggregate(sample,by=sample["V1"],FUN=sum) 
    V1 V1  V2  V3 
1 1 10 578.5299 489.5307 
2 2 20 575.2294 527.2222

How can I apply a different function to each column, e.g. aggregate V2 with mean() and V3 with sum(), without calling aggregate() multiple times?
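To make the limitation concrete, here is a minimal base R sketch of the workaround the question wants to avoid: one aggregate() call per function, followed by a merge. The data frame `x` and its values are made up for illustration and are not the asker's data.

```r
# Hypothetical data with the same layout as the question (V1 is the group key).
x <- data.frame(
  V1 = c(1, 2, 1, 2),
  V2 = c(10, 20, 30, 40),
  V3 = c(1, 2, 3, 4)
)

# One aggregate() call per function, each restricted to a single column ...
sum_v2  <- aggregate(V2 ~ V1, data = x, FUN = sum)
mean_v3 <- aggregate(V3 ~ V1, data = x, FUN = mean)

# ... then join the two results on the grouping column.
res <- merge(sum_v2, mean_v3, by = "V1")
print(res)
```

This works, but it needs one call per aggregation function plus a join, which is the repetition the answers below try to remove.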

2012-05-22 barbaz

That is not aggregation. – mdsumner

@mdsumner Any other function, by any other beautiful name, is welcome too, of course. – barbaz

For this task, I would use ddply from the plyr package:

> library(plyr) 
> ddply(sample, .(V1), summarize, V2 = sum(V2), V3 = mean(V3)) 
    V1  V2  V3 
1 1 578.5299 48.95307 
2 2 575.2294 52.72222

2012-05-22 13:23:26 kohske

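I also really like the simplicity of plyr. I started using it after learning about the package on Stack Overflow. – Alex

Nice! What is the magic behind the summarize argument here? Edit: got it, it is the actual function being applied, and the remaining arguments are passed on to it. – barbaz

Let's call the data frame x rather than sample, since sample is already the name of a base R function.

EDIT:

The by function provides a more direct route than split/apply/combine: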

by(x, list(x$V1), f)

:EDIT
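As a runnable sketch of the by() route: `f` in the call above is a placeholder, so the version below assumes a per-group function that sums V2 and averages V3, and uses made-up data in `x`. Binding the pieces with do.call(rbind, ...) is one possible way to collate the result, not necessarily what the answer's author had in mind.

```r
# Hypothetical data; V1 is the grouping column.
x <- data.frame(
  V1 = c(1, 2, 1, 2),
  V2 = c(10, 20, 30, 40),
  V3 = c(1, 2, 3, 4)
)

# Assumed per-group function: receives the rows of one group as a data frame.
f <- function(d) c(sum = sum(d$V2), mean = mean(d$V3))

# by() splits x on V1 and applies f to each piece.
res_by <- by(x, list(V1 = x$V1), f)

# Stack the per-group vectors into one matrix (one row per group).
res_mat <- do.call(rbind, res_by)
print(res_mat)
```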

lapply(split(x, x$V1), myfunkyfunctionthatdoesadifferentthingforeachcolumn)

Of course, that is not a separate function for each column, but a single function can do both jobs.

myfunkyfunctionthatdoesadifferentthingforeachcolumn = function(x) c(sum(x$V2), mean(x$V3))

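Convenient ways to collate the results are possible, such as the following (but check out the plyr package for a comprehensive solution, and treat this as motivation to learn something better).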

matrix(unlist(lapply(split(x, x$V1), myfunkyfunctionthatdoesadifferentthingforeachcolumn)), ncol = 2, byrow = TRUE, dimnames = list(unique(x$V1), c("sum", "mean")))
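One caveat with the one-liner above: unique(x$V1) returns the groups in order of first appearance, while split() orders them by sorted factor level, so the row labels can end up on the wrong rows if the data are not already sorted by V1. A hedged alternative is to let the labels come from split() itself. The data and the name `summarise_group` below are made up for illustration.

```r
# Hypothetical data where group 2 appears before group 1.
x <- data.frame(
  V1 = c(2, 1, 2, 1),
  V2 = c(20, 10, 40, 30),
  V3 = c(2, 1, 4, 3)
)

# Per-group function returning a named vector (sum of V2, mean of V3).
summarise_group <- function(d) c(sum = sum(d$V2), mean = mean(d$V3))

# split() names each piece after its group, in sorted order ("1", "2").
groups <- split(x, x$V1)

# vapply() keeps those names as column names; t() turns groups into rows.
res <- t(vapply(groups, summarise_group, c(sum = 0, mean = 0)))
print(res)
```

Because the row names are taken from the split list, each label stays attached to its own group regardless of the row order in x.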

2012-05-22 13:24:31 mdsumner

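Good to know! I prefer to avoid implementing an extra intermediate function, so the package kohske suggested fits my needs perfectly :) – barbaz

...or with the data.table function from the package of the same name: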

library(data.table) 

myDT <- data.table(sample) # As mdsumner suggested, this is not a great name 

myDT[, list(sumV2 = sum(V2), meanV3 = mean(V3)), by = V1] 

#  V1 sumV2 meanV3 
# [1,] 1 578.5299 48.95307 
# [2,] 2 575.2294 52.72222

2012-05-22 13:40:22 BenBarnes
