Predicting the class variable using naiveBayes

I am trying to use the naiveBayes function from the e1071 package. Here is what I did:

> library(e1071)
> data(iris)
> head(iris, n=5)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species 
1   5.1   3.5   1.4   0.2 setosa 
2   4.9   3.0   1.4   0.2 setosa 
3   4.7   3.2   1.3   0.2 setosa 
4   4.6   3.1   1.5   0.2 setosa 
5   5.0   3.6   1.4   0.2 setosa 
> model <- naiveBayes(Species ~ ., data = iris)
> pred <- predict(model, newdata = iris, type = 'raw') 
> head(pred, n=5) 
     setosa versicolor virginica 
[1,]  1.00000 2.981309e-18 2.152373e-25 
[2,]  1.00000 3.169312e-17 6.938030e-25 
[3,]  1.00000 2.367113e-18 7.240956e-26 
[4,]  1.00000 3.069606e-17 8.690636e-25 
[5,]  1.00000 1.017337e-18 8.885794e-26

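So far, so good. In the next step I tried to create a new data point and use the naive Bayes model (model) to predict its class variable (Species). For the new point I picked one of the training data points.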

> test = c(5.1, 3.5, 1.4, 0.2) 
> prob <- predict(model, newdata = test, type=('raw'))

Here is the result:

> prob 
     setosa versicolor virginica 
[1,] 0.3333333 0.3333333 0.3333333 
[2,] 0.3333333 0.3333333 0.3333333 
[3,] 0.3333333 0.3333333 0.3333333 
[4,] 0.3333333 0.3333333 0.3333333

This is strange. The data point I used as test is a row of the iris dataset. According to the actual data, the class of this data point is setosa:

Sepal.Length Sepal.Width Petal.Length Petal.Width Species 
1   5.1   3.5   1.4   0.2 setosa

and naiveBayes predicted it correctly when I passed the whole dataset:

   setosa versicolor virginica 
    [1,]  1.00000 2.981309e-18 2.152373e-25

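But when I try to predict the test data point, it returns incorrect results. Why does it return four rows when I am asking for a prediction for a single data point? Am I doing something wrong?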

Source

2015-06-29 MTT

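The newdata you pass in needs column names that match the column names of your training data. Using a row of your training data: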

test2 = iris[1,1:4] 

predict(model, newdata = test2, type=('raw')) 
    setosa versicolor virginica 
[1,]  1 2.981309e-18 2.152373e-25

Using a "new" test data.frame:

test1 = data.frame(Sepal.Length = 5.1, Sepal.Width = 3.5, Petal.Length = 1.4, Petal.Width = 0.2) 

predict(model, newdata = test1, type=('raw')) 
    setosa versicolor virginica 
[1,]  1 2.981309e-18 2.152373e-25
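
If the new point starts out as a plain numeric vector, as in the question, one way to get it into this shape is to name the values after the training predictors and transpose them into a one-row data frame. This is only a sketch: `predictors` and `test_df` are names made up for illustration, and the `predict` call is skipped unless e1071 is installed.

```r
data(iris)

# The raw vector from the question: four unnamed numbers.
test <- c(5.1, 3.5, 1.4, 0.2)

# Names of the predictor columns the model was trained on (everything but Species).
# `predictors` and `test_df` are illustrative names, not part of e1071.
predictors <- setdiff(names(iris), "Species")

# Attach the names, then transpose so the four values become
# four named columns of a single row.
test_df <- as.data.frame(t(setNames(test, predictors)))
print(test_df)

# Only run the model part if e1071 is available.
if (requireNamespace("e1071", quietly = TRUE)) {
  model <- e1071::naiveBayes(Species ~ ., data = iris)
  print(predict(model, newdata = test_df, type = "raw"))
}
```

The order of the values in `test` has to match the order of `predictors`, since the names are attached by position.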

If you give it only one dimension, it can still predict via Bayes' rule, using just that one predictor.

predict(model, newdata = data.frame(Sepal.Width = 3), type=('raw')) 

     setosa versicolor virginica 
[1,] 0.2014921 0.3519619 0.446546
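
The numbers above can be checked by hand with base R. The sketch below assumes what the e1071 documentation describes for numeric predictors, namely a Gaussian per class with that class's mean and standard deviation, combined with the class frequencies as priors. The variable names (`lik`, `prior`, `posterior`) are mine.

```r
data(iris)

x <- 3  # the Sepal.Width value used in the example above
classes <- levels(iris$Species)

# Likelihood of x under a per-class Gaussian fitted to Sepal.Width.
lik <- sapply(classes, function(k) {
  w <- iris$Sepal.Width[iris$Species == k]
  dnorm(x, mean = mean(w), sd = sd(w))
})

# Class priors: relative class frequencies (1/3 each in iris).
prior <- sapply(classes, function(k) mean(iris$Species == k))

# Bayes' rule: posterior is proportional to likelihood times prior.
posterior <- lik * prior / sum(lik * prior)
print(round(posterior, 7))
```

Because the three priors are equal here, the posterior is just the normalized likelihood, and it should agree with the predict output above.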

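If you give it a dimension that is not found in the training data, there is nothing to condition on and you get equally likely classes (effectively the class priors, which are equal for iris). Passing a longer vector only gives you more predictions, one row per element, which is why the four-element vector in the question produced four rows of 0.3333333.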

predict(model, newdata = 1, type=('raw')) 

     setosa versicolor virginica 
[1,] 0.3333333 0.3333333 0.3333333
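
To see where the four rows in the question come from, it helps to look at what a bare vector turns into when it is coerced to a data frame, which as far as I can tell is what predict does with newdata. A small base R sketch (`as_df` is just an illustrative name):

```r
data(iris)

test <- c(5.1, 3.5, 1.4, 0.2)

# Coercing a bare vector gives ONE column with one row per element,
# i.e. four observations of a single unnamed-for-the-model variable.
as_df <- as.data.frame(test)
print(dim(as_df))    # 4 rows, 1 column
print(names(as_df))  # the single column is called "test"

# None of the columns matches a training predictor, so the model
# has no evidence to use for any of the four rows.
print(any(names(as_df) %in% names(iris)))
```

So the vector is read as four separate data points with no usable predictors, rather than one data point with four predictors.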

Source

2015-06-29 15:57:04 Vlo

Perfect solution! Thank you very much. – MTT
