What is happening?

I am trying to learn by() in R (3.0.1). Here is what I am doing.

Open R
attach(iris)
head(iris)
by(iris[,1:4] , Species , mean)

This is what I get:

> by(iris[,1:4] , Species , mean) 

Species: setosa 
[1] NA 
------------------------------------------------------------ 
Species: versicolor 
[1] NA 
------------------------------------------------------------ 
Species: virginica 
[1] NA 
Warning messages: 
1: In mean.default(data[x, , drop = FALSE], ...) : 
    argument is not numeric or logical: returning NA 

2: In mean.default(data[x, , drop = FALSE], ...) : 
    argument is not numeric or logical: returning NA 

3: In mean.default(data[x, , drop = FALSE], ...) : 
    argument is not numeric or logical: returning NA


2014-01-13 lovekesh

@Momo 'iris[, 1:4]' is *not* a factor. 'iris$Species' *is* a factor, but that is what the 'INDICES' argument wants (or at least it is one of the options).

The problem here is that the function you are applying does not work on data frames. In effect you are calling something like this:

R> mean(iris[iris$Species == "setosa", 1:4]) 
[1] NA 
Warning message: 
In mean.default(iris[iris$Species == "setosa", 1:4]) : 
    argument is not numeric or logical: returning NA

That is, you are passing mean() a data frame of 4 columns, containing the rows of the original where Species == "setosa".
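To make that concrete, here is a minimal sketch that asks by() what kind of object it hands to FUN for each group. The helper names (seen, setosa) are only for illustration and are not part of the original answer.

```r
# Ask by() what kind of object FUN receives for each species.
seen <- by(iris[, 1:4], iris$Species, function(d) class(d)[1])
print(seen)

# mean() wants a numeric (or logical) vector, which a data frame is not.
setosa <- iris[iris$Species == "setosa", 1:4]
is.numeric(setosa)        # FALSE: the whole 4-column data frame
is.numeric(setosa[[1]])   # TRUE: a single column
```

Each group arrives as a data frame, which is why mean() returns NA with the warning shown above.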

With by() you need to do this variable by variable, as in

R> by(iris[,1] , iris$Species , mean) 
iris$Species: setosa 
[1] 5.006 
------------------------------------------------------------ 
iris$Species: versicolor 
[1] 5.936 
------------------------------------------------------------ 
iris$Species: virginica 
[1] 6.588

or use colMeans() instead of mean() as the FUN being applied:

R> by(iris[,1:4] , iris$Species , colMeans)
iris$Species: setosa
Sepal.Length  Sepal.Width Petal.Length  Petal.Width
       5.006        3.428        1.462        0.246
------------------------------------------------------------
iris$Species: versicolor
Sepal.Length  Sepal.Width Petal.Length  Petal.Width
       5.936        2.770        4.260        1.326
------------------------------------------------------------
iris$Species: virginica
Sepal.Length  Sepal.Width Petal.Length  Petal.Width
       6.588        2.974        5.552        2.026

If a canned function like colMeans() does not exist, you can always write a wrapper around sapply(), for example:

foo <- function(x, ...) sapply(x, mean, ...) 
by(iris[, 1:4], iris$Species, foo) 

R> by(iris[, 1:4], iris$Species, foo)
iris$Species: setosa
Sepal.Length  Sepal.Width Petal.Length  Petal.Width
       5.006        3.428        1.462        0.246
------------------------------------------------------------
iris$Species: versicolor
Sepal.Length  Sepal.Width Petal.Length  Petal.Width
       5.936        2.770        4.260        1.326
------------------------------------------------------------
iris$Species: virginica
Sepal.Length  Sepal.Width Petal.Length  Petal.Width
       6.588        2.974        5.552        2.026

You might find aggregate() more appealing:

R> with(iris, aggregate(iris[,1:4], list(Species = Species), FUN = mean))
     Species Sepal.Length Sepal.Width Petal.Length Petal.Width
1     setosa        5.006       3.428        1.462       0.246
2 versicolor        5.936       2.770        4.260       1.326
3  virginica        6.588       2.974        5.552       2.026

Notice how I used with() to access Species directly; if you do not want to index via iris$Species, this is far better than attach()ing iris.
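As a possible alternative (not something the original answer used), aggregate() also has a formula interface that takes the data frame through its data argument, so neither with() nor attach() is needed. A minimal sketch:

```r
# Formula interface: "." means every column other than Species.
agg <- aggregate(. ~ Species, data = iris, FUN = mean)
agg
```

The result is an ordinary data frame with one row per species, the same layout as the with() call above.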


2014-01-13 19:42:53

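Hey, thanks for the reply. But I was following a blog post and going through its steps. I did the same thing and kept getting an error. In the by() documentation, the description of the data argument says "data - an R object, normally a data frame, possibly a matrix." Even the examples given there seem to apply it the way I was trying to apply it to the iris dataset. The blog I am reading is: http://nsaunders.wordpress.com/2010/08/20/a-brief-introduction-to-apply-in-r/ Can you tell me why running by() in the blog does not give an error? – lovekesh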

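@lovekesh 'mean()' was changed a while ago so that it **doesn't** work on data frames; since that post is over 3 years old, I am not surprised it no longer works. 'mean()' (and 'sd()') were changed to make the language more consistent; it is easy to write 'sapply(foo, mean)' to get the old behaviour (or, faster, 'colMeans()'), and there were many functions similar to 'mean()' and 'sd()' that *didn't* work on data frames (they needed to be applied).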

Thanks, Gavin, for the information and the reply. :) – lovekesh

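Here is another solution, which combines 'split' and 'sapply'. The result is the same, but transposed. This may be preferable when displaying statistics for many variables, because they are listed vertically.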

sapply(split(iris, iris[5]), function(x) colMeans(x[, c(1:4)]))

             setosa versicolor virginica
Sepal.Length  5.006      5.936     6.588
Sepal.Width   3.428      2.770     2.974
Petal.Length  1.462      4.260     5.552
Petal.Width   0.246      1.326     2.026
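If the species-as-rows layout is wanted after all, the same matrix can be flipped with t(). A short sketch of a slight variation on the call above, which splits only the four numeric columns so that colMeans() can be passed directly; the name by_species is just for illustration:

```r
# Split the numeric columns by species, then take column means per piece.
by_species <- sapply(split(iris[, 1:4], iris$Species), colMeans)
dim(by_species)      # 4 variables by 3 species
t(by_species)        # species as rows, variables as columns
```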


2014-11-26 12:13:00 omd
