2013-07-01 57 views
0

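Apply a function to a data frame based on column index

I want to apply a function to certain columns of a data frame, based on a "design" vector containing the column indices that belong to the same "group" of the experimental design (i.e. replicates). My observations are rows and my sampling points are columns.
The design vector specifies which columns should be grouped together: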

designvector <- c(rep(1,2), rep(2,3), rep(3,3), rep(4,2), rep(5,2), rep(6,2), 
         rep(7,2), rep(8,2), rep(9,2)) 

A small example of the data frame to which I want to apply the function:

structure(list(`1` = c(4381L, 608L, 7648L, 458L, 350L, 203L), 
`1` = c(6450L, 1389L, 4896L, 526L, 920L, 352L), `2` = c(1966L, 
59L, 492L, 5291L, 1401L, 133L), `2` = c(6338L, 281L, 2649L, 
4718L, 1281L, 377L), `2` = c(12399L, 578L, 3094L, 1787L, 
1180L, 541L), `3` = c(9629L, 554L, 7299L, 2819L, 1314L, 497L 
), `3` = c(11329L, 709L, 3720L, 2909L, 1929L, 655L), `3` = c(11319L, 
535L, 5212L, 2191L, 1239L, 633L), `4` = c(7427L, 8637L, 894L, 
2L, 782L, 120L), `4` = c(6748L, 9139L, 431L, 28L, 871L, 224L 
), `5` = c(7125L, 11819L, 1728L, 9L, 607L, 313L), `5` = c(8651L, 
11022L, 442L, 96L, 728L, 249L), `6` = c(17879L, 3402L, 319L, 
6L, 1226L, 489L), `6` = c(20859L, 2648L, 463L, 10L, 1189L, 
408L), `7` = c(13457L, 1124L, 9386L, 18L, 635L, 367L), `7` = c(16292L, 
1732L, 6552L, 20L, 1022L, 431L), `8` = c(9035L, 5887L, 185L, 
11L, 550L, 1814L), `8` = c(14831L, 5833L, 570L, 8L, 1089L, 
1462L), `9` = c(22023L, 2254L, 5212L, 63L, 555L, 1254L), 
`9` = c(16887L, 2491L, 4949L, 68L, 921L, 983L)), .Names = c("1", 
"1", "2", "2", "2", "3", "3", "3", "4", "4", "5", "5", "6", "6", 
"7", "7", "8", "8", "9", "9"), row.names = c(NA, 6L), class = "data.frame") 

However, using ddply I get an error that I really don't understand. ddply(abmat.sum,.(designvector),mean) gives the following output:

designvector V1 
1   1 NA 
2   2 NA 
3   3 NA 
4   4 NA 
5   5 NA 
6   6 NA 
7   7 NA 
8   8 NA 
9   9 NA 
Warning messages: 
1: In mean.default(piece, ...) : 
    argument is not numeric or logical: returning NA 
2: In mean.default(piece, ...) : 
    argument is not numeric or logical: returning NA 
3: In mean.default(piece, ...) : 
    argument is not numeric or logical: returning NA 
4: In mean.default(piece, ...) : 
    argument is not numeric or logical: returning NA 
5: In mean.default(piece, ...) : 
    argument is not numeric or logical: returning NA 
6: In mean.default(piece, ...) : 
    argument is not numeric or logical: returning NA 
7: In mean.default(piece, ...) : 
    argument is not numeric or logical: returning NA 
8: In mean.default(piece, ...) : 
    argument is not numeric or logical: returning NA 
9: In mean.default(piece, ...) : 
    argument is not numeric or logical: returning NA 

I'm at a loss as to what I'm doing wrong here. Any suggestions using ddply, or any method other than looping over the data frame, are welcome.

+2

Your 'designvector' is just: 'rep(1:9, c(2,3,3,2,2,2,2,2,2))' – Arun

+1

What is it you actually want to do? Take a 'mean' of each of those two columns separately? – Arun

+0

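@Arun Thanks for the hint on using 'rep', I didn't know that was possible. Also, I was just trying to take the mean of each row. In any case, Richie Cotton's solution solved my problem. –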

Answers

1

The problem is that abmat.sum is in the wrong form (it is "wide" rather than "long", as ddply requires). Use melt to fix this.

library(reshape2) 
abmat.sum_long <- melt(abmat.sum) 
abmat.sum_long$variable <- as.numeric(abmat.sum_long$variable) 
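As a side note on where the NA values and the "argument is not numeric or logical" warnings come from: ddply hands each piece to the function as a data frame, and mean() on a whole data frame returns NA in recent versions of R. The following is a minimal sketch using a made-up toy data frame (the name `piece` and its values are assumptions for illustration, not the asker's data):

```r
# Hypothetical toy piece, standing in for what ddply passes to the function
piece <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))

# mean() on a whole data frame: the argument is not numeric, so the result
# is NA plus a warning (suppressed here to keep the output clean)
whole <- suppressWarnings(mean(piece))

# mean() on a single numeric column works as expected
col_mean <- mean(piece$a)

# rowMeans() averages across the columns, one value per row
row_means <- rowMeans(piece)
```

This is why a function applied to each piece has to refer to a specific column, or be a function that accepts a data frame.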

You also need to pass summarise to ddply:

library(plyr) 
ddply(abmat.sum_long, .(variable), summarise, mean_value = mean(value)) 
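If the goal is a mean per row within each design group, as the asker mentions in the comments, one possible alternative is to leave the data wide and group the column positions directly in base R. This is only a sketch under that assumption; the `toy` data frame, the `design` vector and the other names are made up for illustration:

```r
# Hypothetical wide data: 3 observations (rows) x 5 sampling points (columns)
toy <- data.frame(s1 = c(1, 2, 3), s2 = c(3, 4, 5),
                  s3 = c(10, 20, 30), s4 = c(20, 30, 40), s5 = c(30, 40, 50))

# Design vector: columns 1-2 belong to group 1, columns 3-5 to group 2
design <- rep(1:2, c(2, 3))

# Split the column positions by group, then take row means within each group
col_groups <- split(seq_along(design), design)
group_means <- sapply(col_groups, function(idx) {
  rowMeans(toy[, idx, drop = FALSE])
})

# One row per observation, one column per design group
print(group_means)
```

Indexing columns by position rather than by name also sidesteps the duplicated column names in the example data, and `drop = FALSE` keeps single-column groups as data frames so rowMeans still works.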
+0

Thanks. It turns out this is exactly the answer I needed. –