How do I test data against a decision tree model in R?

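I built a decision tree from training data using the rpart package in R. Now I have more data, and I want to run it through the tree to check the model. Logically/iteratively, I want to do the following: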

for each datapoint in new data 
    run point thru decision tree, branching as appropriate 
    examine how tree classifies the data point 
    determine if the datapoint is a true positive or false positive

How do I do this in R?

Source

2013-10-27 bernie2436

Use the `predict()` function: http://stat.ethz.ch/R-manual/R-devel/library/rpart/html/predict.rpart.html – David

To be able to use it, I assume you have split your data into a training set and a test set.
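
If the data has not been split yet, a random split along these lines is one option. This is only a sketch: the `kyphosis` data frame that ships with rpart stands in for your own data, and the 70/30 ratio and the seed are arbitrary choices for illustration.

```r
library(rpart)  # also provides the kyphosis example data frame

set.seed(42)  # make the random split reproducible

# Assumption: kyphosis stands in for your own data frame.
n <- nrow(kyphosis)
train_idx <- sample(n, size = floor(0.7 * n))  # row indices for roughly 70% of the data

traindata <- kyphosis[train_idx, ]   # rows used to fit the tree
testdata  <- kyphosis[-train_idx, ]  # held-out rows used to check it
```

Every row ends up in exactly one of the two sets, so the tree never sees the rows it is later checked against.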

To create the model from the training set you can use:

library(rpart)
model <- rpart(y ~ ., data = traindata, minbucket = 5) # You have presumably done this already.

Apply it to the test set:

pred <- predict(model, testdata)

You then get a vector of predictions.
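
What `predict()` returns depends on the kind of tree. As a sketch, again with rpart's bundled `kyphosis` data standing in for your own (the split size and seed are arbitrary): when the response is a factor, rpart fits a classification tree, and the default prediction is a matrix of class probabilities, while `type = "class"` returns the predicted labels.

```r
library(rpart)

set.seed(1)
train_idx <- sample(nrow(kyphosis), 60)  # arbitrary split for illustration
traindata <- kyphosis[train_idx, ]
testdata  <- kyphosis[-train_idx, ]

# Kyphosis is a factor, so rpart fits a classification tree.
model <- rpart(Kyphosis ~ ., data = traindata, minbucket = 5)

prob <- predict(model, newdata = testdata)                  # matrix, one column per class
pred <- predict(model, newdata = testdata, type = "class")  # factor of predicted labels
```

For comparing against the known answers, the label vector `pred` is usually the more convenient of the two.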

In your test set you also have the "true" answers. Let's say they are in the last column.

Simply comparing the two gives the result:

pred == testdata[ , last] # where 'last' equals the index of 'y'

Where the elements are equal you get TRUE; a FALSE means the prediction was wrong.

pred + testdata[, last] > 1 # TRUE for true positives, as it means both vectors are 1 (assumes 0/1 values)
pred == testdata[, last] # TRUE for the predictions that are correct
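
When the response is a factor rather than 0/1, the same counts can be read off a cross-tabulation. The following is a sketch under stated assumptions: rpart's bundled `kyphosis` data stands in for your own, the split and seed are arbitrary, and "present" is treated as the positive class purely for illustration.

```r
library(rpart)

set.seed(1)
train_idx <- sample(nrow(kyphosis), 60)  # arbitrary split for illustration
traindata <- kyphosis[train_idx, ]
testdata  <- kyphosis[-train_idx, ]

model  <- rpart(Kyphosis ~ ., data = traindata, minbucket = 5)
pred   <- predict(model, newdata = testdata, type = "class")
actual <- testdata$Kyphosis

# Cross-tabulate predictions against the known answers (a confusion matrix).
confusion <- table(predicted = pred, actual = actual)
print(confusion)

# Assumption: "present" is the positive class.
tp <- sum(pred == "present" & actual == "present")  # true positives
fp <- sum(pred == "present" & actual == "absent")   # false positives
fn <- sum(pred == "absent"  & actual == "present")  # false negatives
tn <- sum(pred == "absent"  & actual == "absent")   # true negatives
```

The four counts correspond to the four cells of `confusion`, so the table alone is often enough for a quick check.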

It may also be interesting to see what proportion you got right:

mean(pred == testdata[ , last]) # here TRUE will count as a 1, and FALSE as 0

Source

2013-10-27 16:58:53 PascalVKooten

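The `rpart` library has presumably changed since this answer was written. I had to use the following to make it work: `pred <- predict(model, newdata = testdata, type = 'class')` (otherwise you get a full probability matrix). – kynan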
