想想看,我有以下的數據,我很喜歡摺疊一個data.frame到data.frame - 與問題()和骨料()
landlines <- data.frame(
year=rep(c(1990,1995,2000,2005,2010),times=3),
country=rep(c("US", "Brazil", "Asia"), each=5),
pct = c(0.99, 0.99, 0.98, 0.05, 0.9,
0.4, 0.5, 0.55, 0.5, 0.45,
0.7, 0.85, 0.9, 0.85, 0.75)
)
someStats <- function(x)
{
dp <- as.matrix(x$pct)-mean(x$pct)
indp <- as.matrix(x$year)-mean(x$year)
f <- lm.fit(indp,dp)$coefficients
w <- sd(x$pct)
m <- min(x$pct)
results <- c(f,w,m)
names(results) <- c("coef","sdev", "minPct")
results
}
我可以申請該函數返回彙總統計作用到的數據子集成功是這樣的:
> someStats(landlines[landlines$country=="US",])
coef sdev minPct
-0.022400 0.410938 0.050000
或看崩潰的國家是這樣的:
> by(landlines, list(country=landlines$country), someStats)
country: Asia
coef sdev minPct
0.00200000 0.08215838 0.70000000
---------------------------------------------------------------------------------------
country: Brazil
coef sdev minPct
0.00200000 0.05700877 0.40000000
---------------------------------------------------------------------------------------
country: US
coef sdev miPct
-0.022400 0.410938 0.050000
麻煩的是,這不是data.frame
對象,我需要進行進一步的處理,也不會投這樣:「沒問題」
> as.data.frame(by(landlines, list(country=landlines$country), someStats))
Error in as.data.frame.default(by(landlines, list(country = landlines$country), :
cannot coerce class '"by"' into a data.frame
我想,既然類似aggregate()
功能確實返回data.frame
:
> aggregate(landlines$pct, by=list(country=landlines$country), min)
country x
1 Asia 0.70
2 Brazil 0.40
3 US 0.05
麻煩的是,它不與任意函數正常工作:
> aggregate(landlines, by=list(country=landlines$country), someStats)
Error in x$pct : $ operator is invalid for atomic vectors
我真正想要得到的是一個data.frame
對象與下列:
- 國家
- COEF
- 發展局局長
- minPct
我怎麼能這樣做?
這很好,謝謝! – 2012-04-04 15:19:18