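Getting data counts in R
I am using R for the first time. I have the following dataset (a mock-up of the much larger dataset I am actually working with):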
Type Date Size Color
L shape 2008-04-14 161 blue
L shape 2010-10-16 654 yellow
L shape 2005-07-03 149 blue
L shape 2006-08-16 657 yellow
L shape 2007-04-08 229 yellow
L shape 2004-03-17 784 green
Y shape 2014-02-22 917 pink
Y shape 2012-05-04 186 green
Y shape 2006-11-25 641 yellow
Y shape 2015-09-07 493 blue
Y shape 2011-07-06 953 green
For each type, I want to retrieve the number of colors that occur, the dates, and the minimum, maximum, and mean size. The output should look like this:
Type Colors Dates Mean Size Min Size Max Size
L shape 3 2008-04-14 439 149 784
2010-10-16
2005-07-03
2006-08-16
2007-04-08
2004-03-17
Y shape 4 2014-02-22 638 186 953
2012-05-04
2006-11-25
2015-09-07
2011-07-06
This is what I have done so far:
cat <- big_catalog
na.rm = TRUE
library(plyr)
mydata <-ddply(cat, c("Type", "Date", "Size", "Color"), summarize,
Colors = length(Color),
Dates = (Date),
Mean_Size = mean(Size),
Minimum_Size = min(Size),
Maximum_Size = max(Size)
)
But I ended up with this:
Type Date Size Color Colors Dates Mean Size Min Size Max Size
L shape 2008-04-14 161 blue 2 2008-04-14 161 161 161
L shape 2010-10-16 654 yellow 3 2010-10-16 654 654 654
L shape 2005-07-03 149 blue 2 2005-07-03 149 149 149
L shape 2006-08-16 657 yellow 3 2006-08-16 657 657 657
L shape 2007-04-08 229 yellow 2 2007-04-08 229 229 229
L shape 2004-03-17 784 green 1 2004-03-17 784 784 784
Y shape 2014-02-22 917 pink 1 2014-02-22 917 917 917
Y shape 2012-05-04 186 green 2 2012-05-04 186 186 186
Y shape 2006-11-25 641 yellow 1 2006-11-25 641 641 641
Y shape 2015-09-07 493 blue 1 2015-09-07 493 493 493
Y shape 2011-07-06 953 green 2 2011-07-06 953 953 953
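I obviously need to loop over this, but I am very new to R and don't know how to do it.
Don't group by every column, just group by the 'Type' column (since you want everything done "per type"). Though your requirement that 'Date' span multiple rows while everything else is a single row complicates things... – Gregor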
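To illustrate the comment above, here is a minimal sketch in base R that groups by 'Type' only. It rebuilds the sample rows from the question under the name `big_catalog`, and the helper `summarise_type` is a hypothetical name made up for this example. It also assumes that collapsing the dates into one comma-separated string per type is an acceptable stand-in for the multi-row 'Dates' column, and that "Colors" means the number of distinct colors (which matches the 3 and 4 in the desired output).

```r
# Sample rows copied from the question; `big_catalog` stands in for the real table.
big_catalog <- data.frame(
  Type  = c(rep("L shape", 6), rep("Y shape", 5)),
  Date  = as.Date(c("2008-04-14", "2010-10-16", "2005-07-03", "2006-08-16",
                    "2007-04-08", "2004-03-17", "2014-02-22", "2012-05-04",
                    "2006-11-25", "2015-09-07", "2011-07-06")),
  Size  = c(161, 654, 149, 657, 229, 784, 917, 186, 641, 493, 953),
  Color = c("blue", "yellow", "blue", "yellow", "yellow", "green",
            "pink", "green", "yellow", "blue", "green"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: reduce all rows of a single Type to one summary row.
summarise_type <- function(d) {
  data.frame(
    Type      = d$Type[1],
    Colors    = length(unique(d$Color)),                # distinct colors (assumption)
    Dates     = paste(format(d$Date), collapse = ", "), # all dates in one string (assumption)
    Mean_Size = mean(d$Size, na.rm = TRUE),
    Min_Size  = min(d$Size, na.rm = TRUE),
    Max_Size  = max(d$Size, na.rm = TRUE),
    stringsAsFactors = FALSE
  )
}

# Split on Type only, summarise each piece, then stack the rows back together.
by_type <- do.call(rbind, lapply(split(big_catalog, big_catalog$Type), summarise_type))
rownames(by_type) <- NULL
print(by_type)
```

No explicit loop is needed: `split` cuts the data frame into one piece per type and `lapply` applies the summary to each piece. Note that `na.rm = TRUE` only has an effect when passed as an argument to `mean`, `min`, or `max`; assigning it on a line of its own, as in the question, just creates an unused variable. With plyr, the same idea should amount to passing `"Type"` as the only grouping variable to `ddply` and using `length(unique(Color))` for the color count.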