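This is adapted from @Ben's question and from @TylerRinker's answer to "apply a function over groups of columns". It should be able to apply any function over a matrix or data frame by intervals of columns.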
# Create sample data for reproducible example
n <- 1000
set.seed(1234)
x <- matrix(runif(30 * n), ncol = n)
# Function to apply 'fun' to object 'x' over every 'by' columns
# Alternatively, 'by' may be a vector of groups
byapply <- function(x, by, fun, ...)
{
    # Create index list
    if (length(by) == 1)
    {
        nc <- ncol(x)
        split.index <- rep(1:ceiling(nc / by), each = by, length.out = nc)
    } else # 'by' is a vector of groups
    {
        nc <- length(by)
        split.index <- by
    }
    index.list <- split(seq(from = 1, to = nc), split.index)
    # Pass index list to fun using sapply() and return object
    sapply(index.list, function(i)
    {
        # drop = FALSE keeps a single-column group as a matrix
        do.call(fun, list(x[, i, drop = FALSE], ...))
    })
}
# Run function
y <- byapply(x, 16, rowMeans)
# Test to make sure it returns expected result
y.test <- rowMeans(x[, 17:32])
all.equal(y[, 2], y.test)
# TRUE
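To see how the column groups are formed, the index-building step from byapply can be run on its own. The sketch below uses small illustrative values (7 columns in blocks of 3, chosen only for this example) and suggests that the last group is simply shorter when the column count is not a multiple of 'by':

```r
# Illustrative values, not taken from the example above
nc <- 7   # number of columns
by <- 3   # block width

# Same expression byapply() uses to label each column with its group
split.index <- rep(1:ceiling(nc / by), each = by, length.out = nc)
split.index
# 1 1 1 2 2 2 3

# Column positions collected per group
index.list <- split(seq(from = 1, to = nc), split.index)
lengths(index.list)
# groups "1", "2" and "3" hold 3, 3 and 1 column positions
```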
You can do other odd things with it. For example, if you need the total sum of every 10 columns, being sure to remove NAs if present:
y.sums <- byapply(x, 10, sum, na.rm = TRUE)
y.sums[1]
# 146.7756
sum(x[, 1:10], na.rm = TRUE)
# 146.7756
Or find the standard deviations:
byapply(x, 10, apply, 1, sd)
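The call above works because byapply forwards everything after 'fun' through do.call(), so the extra arguments 1 and sd end up as the second and third arguments of apply(). A minimal sketch of that forwarding on a small made-up matrix (the name m and its values are assumptions for illustration only):

```r
# Small made-up matrix, used only to illustrate argument forwarding
set.seed(1)
m <- matrix(rnorm(12), nrow = 3)

# What byapply() effectively does for one group of columns
via.do.call <- do.call(apply, list(m[, 1:2], 1, sd))

# The equivalent direct call: row-wise standard deviation of columns 1 and 2
direct <- apply(m[, 1:2], 1, sd)

all.equal(via.do.call, direct)
```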
Update: 'by' can also be specified as a vector of groups:
byapply(x, rep(1:10, each = 10), rowMeans)
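A group vector does not have to describe contiguous blocks. The sketch below is a hedged illustration: it restates byapply in condensed form so the block runs on its own (the drop = FALSE is a precaution added here), and the matrix m and the labels in groups are made up for the example:

```r
# Condensed restatement of byapply() so this block is self-contained
byapply <- function(x, by, fun, ...) {
    if (length(by) == 1) {
        nc <- ncol(x)
        split.index <- rep(1:ceiling(nc / by), each = by, length.out = nc)
    } else {
        nc <- length(by)
        split.index <- by
    }
    index.list <- split(seq(from = 1, to = nc), split.index)
    sapply(index.list, function(i) {
        # drop = FALSE added here as a precaution for one-column groups
        do.call(fun, list(x[, i, drop = FALSE], ...))
    })
}

# Made-up data: 2 rows, 6 columns, with interleaved group labels
m <- matrix(1:12, nrow = 2)
groups <- c("a", "b", "a", "b", "c", "c")

res <- byapply(m, groups, rowSums)
res
#       a  b  c
# [1,]  6 10 20
# [2,]  8 12 22
```

Columns 1 and 3 are summed under "a", columns 2 and 4 under "b", and columns 5 and 6 under "c"; the result columns take their names from the group labels.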
Agree with @Joran, the answers to my question that you linked to should be easy to adapt to answer this one. – Ben