R: group by and count non-NA values

I have a data frame with scattered NAs.

toy_df 
# Y X1 X2 Label 
# 5 3 3 A 
# 3 NA 2 B 
# 3 NA NA C 
# 2 NA 6 B

I want to group this by the Label field and count how many non-NA values each variable has for each label.

desired output: 
# Label Y X1 X2 
# A  1 1 1 
# B  2 0 2 
# C  1 0 0

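I have done this with a loop, but it is slow and untidy, and I'm sure there is a better way.

aggregate seems to get halfway there, but its count includes the NAs.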

aggregate(toy_df, list(toy_df$Label), FUN=length)

Any ideas appreciated...

We can use data.table. Convert the 'data.frame' to a 'data.table' (setDT(toy_df)), group by 'Label', loop through the Subset of Data.table (.SD), and get the sum of the non-NA values (sum(!is.na(x))).

library(data.table) 
setDT(toy_df)[, lapply(.SD, function(x) sum(!is.na(x))), by = Label] 
# Label Y X1 X2 
#1:  A 1 1 1 
#2:  B 2 0 2 
#3:  C 1 0 0
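To illustrate why sum(!is.na(x)) counts the non-NA values, here is a minimal base R sketch. The vectors x2_b and x1_b are hypothetical names; they hold the X2 and X1 values for Label B copied from the question's printout.

```r
# Values for Label "B", copied from the toy_df printout in the question
# (x2_b and x1_b are illustrative names, not part of the original post)
x2_b <- c(2, 6)
x1_b <- c(NA, NA)

# is.na() returns a logical vector; negating it marks the non-missing entries
not_missing <- !is.na(x1_b)

# sum() treats TRUE as 1 and FALSE as 0, so it counts the non-NA values
n_x2 <- sum(!is.na(x2_b))
n_x1 <- sum(not_missing)

print(c(X2 = n_x2, X1 = n_x1))
```

For Label B this gives 2 for X2 and 0 for X1, matching the B row of the desired output.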

Or use the same approach with dplyr

library(dplyr) 
toy_df %>% 
     group_by(Label) %>% 
     summarise_each(funs(sum(!is.na(.))))
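In dplyr 1.0.0 and later, summarise_each() and funs() are deprecated in favour of across(). The following is a sketch of the same idea in that style, not part of the original answer; toy_df is rebuilt here from the values printed in the question, and counts is a name chosen for illustration.

```r
library(dplyr)

# toy_df rebuilt from the printout in the question (assumed column types)
toy_df <- data.frame(
  Y     = c(5, 3, 3, 2),
  X1    = c(3, NA, NA, NA),
  X2    = c(3, 2, NA, 6),
  Label = c("A", "B", "C", "B")
)

# across(everything(), ...) applies the function to every non-grouping column
counts <- toy_df %>%
  group_by(Label) %>%
  summarise(across(everything(), ~ sum(!is.na(.x))))

print(counts)
```

The grouping column Label is excluded from everything() automatically, so only Y, X1 and X2 are counted.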

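Or a base R option with by and colSums, grouped by the 4th column, on the logical matrix (!is.na(toy_df[-4]))

by(!is.na(toy_df[-4]), toy_df[4], FUN = colSums)

Or with rowsum, using an approach similar to by except that it calls the rowsum function.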

rowsum(+(!is.na(toy_df[-4])), group=toy_df[,4]) 
# Y X1 X2 
#A 1 1 1 
#B 2 0 2 
#C 1 0 0

2016-12-14 19:08:51 akrun

# Column names are case-sensitive: the grouping column is Label, not label
aggregate(cbind(toy_df$Y, toy_df$X1, toy_df$X2), list(toy_df$Label), 
      FUN = function (x) sum(!is.na(x)))

2016-12-14 19:07:44

Or in base R

aggregate(toy_df[,1:3], by=list(toy_df$Label), FUN=function(x) { sum(!is.na(x))})
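With an unnamed list in by, aggregate labels the grouping column Group.1. As a small sketch (assuming toy_df holds the values printed in the question, and using res as an illustrative name), naming the list element yields the Label header from the desired output:

```r
# toy_df rebuilt from the printout in the question (assumed column types)
toy_df <- data.frame(
  Y     = c(5, 3, 3, 2),
  X1    = c(3, NA, NA, NA),
  X2    = c(3, 2, NA, 6),
  Label = c("A", "B", "C", "B")
)

# Naming the element of the by-list sets the name of the grouping column
res <- aggregate(toy_df[, 1:3],
                 by = list(Label = toy_df$Label),
                 FUN = function(x) sum(!is.na(x)))

print(res)
```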

2016-12-14 19:11:03 G5W
