R ddply row summary statistics

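For each row of the data frame below (identified by FID_Bounda, NAME, DESCRIPTIO and SOVEREIGNT), I am trying to calculate the mean, standard deviation and coefficient of variation across the values in all columns whose names start with crN.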

uadt <- structure(list(FID_Bounda = 0:7, NAME = c("Bedfordshire", "Berkshire", 
"Bristol", "Buckinghamshire", "Cambridgeshire", "Cheshire", "Derbyshire", 
"Devon"), DESCRIPTIO = c("Ceremonial County", "Ceremonial County", 
"Ceremonial County", "Ceremonial County", "Ceremonial County", 
"Ceremonial County", "Ceremonial County", "Ceremonial County" 
), SOVEREIGNT = c("England", "England", "England", "England", 
"England", "England", "England", "England"), crN1 = c(61.944107636, 
38.769347117, 0.810167027, 63.721241962, 191.046323469, 81.467146994, 
61.65529268, 288.751788714), crN10 = c(60.33595964, 38.326639788, 
0.834289164, 63.009539538, 185.25772542, 82.936101454, 61.985178493, 
304.951827268), crN100 = c(53.385110882, 33.530058107, 0.739041324, 
55.601839364, 165.604271128, 76.386014559, 55.591194915, 284.739586188 
), crN1000 = c(58.397452282, 37.277298648, 0.820739862, 61.716749153, 
175.436497697, 82.461823706, 61.762203751, 321.414544333)), .Names = c("FID_Bounda", 
"NAME", "DESCRIPTIO", "SOVEREIGNT", "crN1", "crN10", "crN100", 
"crN1000"), row.names = c(NA, 8L), class = "data.frame")

I tried to obtain these values using the code outlined in cookbook-r:

library(plyr) 
cdata <- ddply(uadt, c("FID_Bounda","NAME","DESCRIPTIO","SOVEREIGNT"), summarise, 
       N = length(grep("crN", names(uadt), value = T)), 
       mean = mean(grep("crN", names(uadt), value = F)), 
       sd = sd(grep("crN", names(uadt), value = F)), 
       se = sd/sqrt(N) 
) 
cdata

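This correctly calculates N, the number of crN columns, but it gives the same mean, SD and SE for every row. Any help with where the problem lies would be much appreciated, as the real dataset has 1000 columns, all following the same naming pattern crNnumber.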

Source

2016-09-08 B.Wel

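I know this is not a perfect answer, but it may be worth using more up-to-date tools (equally, I am aware of the irony in that statement, since my answer does not use tidyr). Still, the approach I would take is: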

library(reshape2) 
madt <- melt(uadt, 
      id.vars = c("FID_Bounda", "NAME", 
         "DESCRIPTIO", "SOVEREIGNT")) 
library(dplyr) 
cdata <- summarise(group_by(madt, 
          FID_Bounda, NAME, 
          DESCRIPTIO, SOVEREIGNT), 
        N = n_distinct(variable), 
        mean = mean(value), 
        sd = sd(value), 
        se = sd/sqrt(N))

This produces the correct output.
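For completeness, the same reshape-then-summarise idea can be sketched with tidyr instead of reshape2. This assumes tidyr 1.0.0 or later (for `pivot_longer`) and dplyr 1.0.0 or later (for the `.groups` argument); the small data frame is a made-up stand-in for `uadt` with rounded values, not the real data.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for uadt (hypothetical, values rounded from the question)
uadt <- data.frame(
  FID_Bounda = 0:1,
  NAME = c("Bedfordshire", "Berkshire"),
  crN1 = c(61.9, 38.8),
  crN10 = c(60.3, 38.3),
  crN100 = c(53.4, 33.5)
)

cdata <- uadt %>%
  # Stack every crN* column into long format: one row per (row, column) pair
  pivot_longer(starts_with("crN"), names_to = "variable", values_to = "value") %>%
  # One group per original row
  group_by(FID_Bounda, NAME) %>%
  summarise(
    N = n(),
    mean = mean(value),
    sd = sd(value),
    se = sd / sqrt(N),
    .groups = "drop"
  )
cdata
```

Selecting with `starts_with("crN")` means the call does not need to change when the real data has 1000 such columns.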

Source

2016-09-08 09:02:15 NJBurgo

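The example in the cookbook calculates the mean and the other functions down the columns, not across the rows, which is what you want.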
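As a small illustration of why every row came out identical in the question: `grep()` with `value = FALSE` returns the positions of the matching column names, not the data in those columns, so the mean and SD were taken over the same four integers each time. The `cols` vector below simply restates the column names from the question.

```r
# Column names as in the question's data frame
cols <- c("FID_Bounda", "NAME", "DESCRIPTIO", "SOVEREIGNT",
          "crN1", "crN10", "crN100", "crN1000")

# value = FALSE (the default) returns positions, not data
idx <- grep("crN", cols, value = FALSE)
idx        # 5 6 7 8

# These are the numbers the question's summarise() call worked on
mean(idx)  # 6.5
sd(idx)    # about 1.29
```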

One way to achieve this in base R is:

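# Summary functions to apply to each row
functions <- list(length, mean, sd) 

# Subset to the numeric columns (5:8) before apply(), so the values are
# not coerced to character along with the text columns
d <- lapply(functions, function(y) { 
    apply(uadt[5:8], 1, y) 
}) 

# One column per function, one row per row of uadt
calc <- as.data.frame(do.call(cbind, d)) 
names(calc) <- c("N", "mean", "sd") 

# Reattach the identifying columns and add the standard error
cdata <- cbind(uadt[1:4], calc) 
cdata$se <- cdata$sd/sqrt(cdata$N)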

If there are more numeric columns, simply change the range 5:8 accordingly.
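With 1000 columns it may be easier to pick the columns by name pattern than by a hard-coded range. The following is only a sketch on a made-up three-row stand-in for `uadt` (rounded values); it also adds the coefficient of variation asked for in the question, taken here as sd divided by mean.

```r
# Toy stand-in for uadt (hypothetical, values rounded from the question)
uadt <- data.frame(
  FID_Bounda = 0:2,
  NAME = c("Bedfordshire", "Berkshire", "Bristol"),
  crN1 = c(61.9, 38.8, 0.81),
  crN10 = c(60.3, 38.3, 0.83),
  crN100 = c(53.4, 33.5, 0.74),
  stringsAsFactors = FALSE
)

# Positions of every column whose name starts with "crN"
cr_cols <- grep("^crN", names(uadt))

# Numeric matrix of just those columns
vals <- as.matrix(uadt[cr_cols])

# Keep the identifying columns and append the row-wise statistics
cdata <- uadt[-cr_cols]
cdata$N    <- ncol(vals)
cdata$mean <- rowMeans(vals)
cdata$sd   <- apply(vals, 1, sd)
cdata$se   <- cdata$sd / sqrt(cdata$N)
cdata$cv   <- cdata$sd / cdata$mean  # coefficient of variation
cdata
```

The `^` anchor restricts the match to names that begin with crN, so an unrelated column that merely contains those letters would not be picked up.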

Source

2016-09-08 09:26:00 thepule
