2013-12-19 33 views
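How to summarize date data by group in R

I want to aggregate the sample data below into a new data frame with the following columns:

Population, Sample Size (N), Percent Completed (%)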

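Sample size is the count of all records for each population. I can get this with the table command or with tapply. Percent completed is the percentage of records that have an End_Date (any record without an End_Date is assumed to be incomplete). This is where I am lost!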
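
For illustration, the two quantities described above could be computed with tapply roughly as follows. This is a minimal sketch on toy data: the data frame name `records`, its values, and the output column names are assumptions and are not taken from the sample data in this question.

```r
# Toy data (hypothetical values, not the question's data set).
records <- data.frame(
  Population = factor(c("A", "A", "A", "B", "B")),
  End_Date   = as.Date(c("2013-11-28", NA, "2013-11-30",
                         "2013-12-01", "2013-12-02"))
)

# Sample size: number of records in each population.
n <- tapply(records$End_Date, records$Population, length)

# Percent completed: share of records with a non-missing End_Date.
# The mean of a logical vector is the fraction of TRUE values.
pct <- tapply(!is.na(records$End_Date), records$Population, mean) * 100

# Combine both summaries into one data frame, one row per population.
summary_df <- data.frame(
  Population        = names(n),
  N                 = as.vector(n),
  Percent_Completed = as.vector(pct)
)
summary_df
```

Both tapply calls group by the same factor, so their results come back in the same order and can be combined column by column.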

Sample data

sample <- structure(list(Population = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 
    2L, 2L, 3L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 
    1L, 2L, 2L, 3L, 3L, 3L, 3L, 1L, 1L, 3L, 3L, 3L, 3L), .Label = c("Glommen", 
    "Kaseberga", "Steninge"), class = "factor"), Start_Date = structure(c(16032, 
    16032, 16032, 16032, 16032, 16036, 16036, 16036, 16037, 16038, 
    16038, 16039, 16039, 16039, 16039, 16039, 16039, 16041, 16041, 
    16041, 16041, 16041, 16041, 16044, 16044, 16045, 16045, 16045, 
    16045, 16048, 16048, 16048, 16048, 16048, 16048), class = "Date"), 
     End_Date = structure(c(NA, 16037, NA, NA, 16036, 16043, 16040, 
     16041, 16042, 16042, 16042, 16043, 16043, 16043, 16043, 16043, 
     16043, 16045, 16045, 16045, 16045, 16045, NA, 16048, 16048, 
     16049, 16049, NA, NA, 16052, 16052, 16052, 16052, 16052, 
     16052), class = "Date")), .Names = c("Population", "Start_Date", 
    "End_Date"), row.names = c(NA, 35L), class = "data.frame") 

Answers

2

You can do this with split/apply/combine:

spl = split(sample, sample$Population) 
new.rows = lapply(spl, function(x) data.frame(Population=x$Population[1], 
               SampleSize=nrow(x), 
               PctComplete=sum(!is.na(x$End_Date))/nrow(x))) 
combined = do.call(rbind, new.rows) 
combined 

#           Population SampleSize PctComplete
# Glommen      Glommen         13   0.6923077
# Kaseberga  Kaseberga          7   1.0000000
# Steninge    Steninge         15   0.8666667

A word of caution: sample is the name of a base function, so you should choose a different name for your data frame.
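
To illustrate the caveat, here is a minimal sketch that assumes a hypothetical replacement name, `records`, and toy values. R can usually still resolve a call to `sample()` when a data frame shares the name, but keeping the names distinct avoids any doubt about which object is meant.

```r
# Toy data under a name that does not clash with base::sample()
# (the name `records` and the values are hypothetical).
records <- data.frame(
  Population = factor(c("Glommen", "Glommen", "Kaseberga")),
  End_Date   = as.Date(c(NA, "2013-11-28", "2013-12-05"))
)

# `sample` now refers only to the base function, so this reads unambiguously.
set.seed(42)                                   # make the shuffle reproducible
shuffled <- records[sample(nrow(records)), ]   # same rows in random order
nrow(shuffled)
```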

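Sorry about the data frame name; I was trying to keep it simple. I appreciate a solution that uses base functions. I have a more complex problem, and your solution helped me figure it out.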

2

This is easy with the plyr package:

library(plyr) 
ddply(sample, .(Population), summarize, 
     Sample_Size = length(End_Date), 
     Percent_Completed = mean(!is.na(End_Date)) * 100) 

#   Population Sample_Size Percent_Completed
# 1    Glommen          13          69.23077
# 2  Kaseberga           7         100.00000
# 3   Steninge          15          86.66667
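
As a brief aside on why `mean(!is.na(End_Date))` yields the completion rate: logical values are treated as 0 and 1 in arithmetic, so the mean of a logical vector is the fraction of TRUE values. The sketch below uses a hypothetical three-element vector, not the question's data. Note that the split/apply/combine answer reports this value as a proportion, while this answer multiplies by 100 to report a percentage.

```r
# Hypothetical End_Date vector: two finished records and one still open.
end_date <- as.Date(c("2013-11-28", NA, "2013-11-30"))

completed <- !is.na(end_date)        # TRUE where an end date exists

n_completed  <- sum(completed)       # count of completed records
frac         <- mean(completed)      # same as n_completed / length(completed)
pct          <- frac * 100           # the same rate expressed as a percentage

c(n_completed = n_completed, fraction = frac, percent = pct)
```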

This is a very nice solution. I only voted for the split/apply/combine answer because I like learning R with the base packages. Thanks!