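How to count the observations in each node of a tree

I am currently working with the wine data from the MMST package. I have split the whole dataset into training and test sets and built a tree with the code below: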
library("rpart")
library("gbm")
library("randomForest")
library("MMST")
data(wine)
aux <- c(1:178)
train_indis <- sample(aux, 142, replace = FALSE)
test_indis <- setdiff(aux, train_indis)
train <- wine[train_indis,]
test <- wine[test_indis,] #### divide the dataset into training and testing
model.control <- rpart.control(minsplit = 5, xval = 10, cp = 0)
fit_wine <- rpart(class ~ MalicAcid + Ash + AlcAsh + Mg + Phenols + Proa + Color + Hue + OD + Proline, data = train, method = "class", control = model.control)
dev.new()  # opens a new graphics device on any platform; windows() only exists on Windows
plot(fit_wine, branch = 0.5, uniform = TRUE, compress = TRUE, main = "Full Tree: without pruning")
text(fit_wine, use.n = TRUE, all = TRUE, cex = .6)
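This gives me a plot like this:
What do the numbers under each node mean (for example, 0/1/48 under Grignolino)? If I want to know how many training and test samples fall into each node, what should I write in the code?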
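With `use.n = TRUE`, `text()` labels each node of a classification tree with the number of training observations per class, in the order of the response factor's levels, so 0/1/48 would be read as 0, 1 and 48 observations of the first, second and third class. The same numbers are stored in the fitted object. The following is only a minimal sketch: it uses the built-in `iris` data as a stand-in (an assumption, so that it runs without MMST), and `fit`, `node_info`, `class_counts` and `leaf_of_row` are illustrative names; with the question's objects you would use `fit_wine` in place of `fit`.

```r
library(rpart)

# Stand-in data (assumption): iris ships with R. Replace `fit` with `fit_wine`
# to inspect the tree from the question.
set.seed(1)
fit <- rpart(Species ~ ., data = iris, method = "class",
             control = rpart.control(minsplit = 5, xval = 10, cp = 0))

# fit$frame has one row per node; its row names are rpart's node numbers.
node_info <- data.frame(
  node    = as.integer(rownames(fit$frame)),
  is_leaf = fit$frame$var == "<leaf>",
  n       = fit$frame$n            # training observations reaching the node
)

# For class trees, yval2 holds: fitted class, per-class counts, per-class
# probabilities, node probability. Columns 2..(k + 1) are the counts that
# text(..., use.n = TRUE) prints as a/b/c.
k <- length(attr(fit, "ylevels"))
class_counts <- fit$frame$yval2[, 1 + seq_len(k), drop = FALSE]
colnames(class_counts) <- attr(fit, "ylevels")
print(cbind(node_info, class_counts))

# fit$where gives, for each training row, the row of fit$frame (a leaf)
# where it ends up; map it to node numbers and tabulate.
leaf_of_row <- as.integer(rownames(fit$frame))[fit$where]
print(table(leaf_of_row))
```

`fit$frame$n` covers every node, internal ones included, while `fit$where` only identifies the terminal node of each training row.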
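Thanks for your answer. I tried the `predict()` approach, and the result is a series of classes such as Barbera, Barolo and Grignolino. Is there a way to see which node each observation finally falls into, since several nodes represent the same class?
I ran it like this: `test_pred <- predict(fit_wine, test, type = "class"); test_pred` and it returns a series of classes.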
Yes, `type = "class"` is better, but for what we want, `type = "matrix"` seems more helpful. – MattBagg
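One way to see which node each test observation lands in, offered as a sketch rather than the approach the commenters had in mind: `predict(..., type = "vector")` returns the `yval` entry of the terminal node, so a copy of the tree whose `yval` column has been overwritten with node numbers reports node numbers instead of classes. The example again assumes `iris` as stand-in data, and `node_fit`, `test_node` and the other names are illustrative; with the question's objects you would use `fit_wine` and `test`.

```r
library(rpart)

# Stand-in train/test split (assumption): iris instead of the MMST wine data.
set.seed(1)
idx   <- sample(nrow(iris), 120)
train <- iris[idx, ]
test  <- iris[-idx, ]
fit <- rpart(Species ~ ., data = train, method = "class",
             control = rpart.control(minsplit = 5, cp = 0))

# Copy the tree and make each node "predict" its own node number.
node_fit <- fit
node_fit$frame$yval <- as.integer(rownames(node_fit$frame))

# Terminal node and predicted class for every test row.
test_node  <- predict(node_fit, newdata = test, type = "vector")
test_class <- predict(fit, newdata = test, type = "class")

# Test observations per terminal node, split by predicted class.
print(table(node = test_node, predicted = test_class))

# type = "matrix" returns the terminal node's yval2 row for each observation
# (fitted class, class counts, class probabilities, node probability), so
# rows from the same leaf are identical.
test_mat <- predict(fit, newdata = test, type = "matrix")
print(head(test_mat))
```

The node numbers in `test_node` match the row names of `fit$frame`, so the test counts can be compared with the training counts node by node.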