2016-04-27 106 views

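R xgboost predict with early.stop.round

I have the following code. Suppose the optimization stops after round 600 and the best round was 450. Which model will be used for prediction: the one after round 450 or the one after round 600?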

watchlist <- list(val = dval, train = dtrain)

param <- list(
  objective        = "binary:logistic",
  booster          = "gbtree",
  eval_metric      = "auc",
  eta              = 0.02,
  max_depth        = 7,
  subsample        = 0.6,
  colsample_bytree = 0.7
)

clf <- xgb.train(
  params           = param,
  data             = dtrain,
  nrounds          = 2000,
  verbose          = 0,
  early.stop.round = 150,
  watchlist        = watchlist,
  maximize         = TRUE
)

preds <- predict(clf, test)

Answer

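After some research, I found the answer myself. The prediction will use the model as it stands after round 600, that is, with every tree built before training stopped. If you want to predict with the best-performing model instead, limit the number of trees to the best round:

preds <- predict(clf, test, ntreelimit = clf$bestInd)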
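
To see why stopping at round 600 goes together with a best round of 450 when early.stop.round = 150, here is a minimal base-R sketch of a patience-based stopping rule. It is a simplified simulation, not xgboost's internal code: the function simulate_early_stop and the AUC curve in scores are made up for illustration.

```r
# Simplified simulation of a patience-based early-stopping rule.
# This is NOT xgboost's implementation; names and data are hypothetical.
simulate_early_stop <- function(scores, patience, maximize = TRUE) {
  best_score <- if (maximize) -Inf else Inf
  best_round <- 0L
  for (i in seq_along(scores)) {
    improved <- if (maximize) scores[i] > best_score else scores[i] < best_score
    if (improved) {
      best_score <- scores[i]
      best_round <- i
    } else if (i - best_round >= patience) {
      # No improvement for `patience` rounds in a row: training halts here,
      # but the trees from rounds after best_round have already been added.
      return(list(stop_round = i, best_round = best_round))
    }
  }
  list(stop_round = length(scores), best_round = best_round)
}

# Made-up validation AUC curve: rises until round 450, then slowly declines.
rounds <- 1:2000
scores <- 0.9 - abs(rounds - 450) / 1e5

res <- simulate_early_stop(scores, patience = 150)
res$stop_round  # round at which training halts
res$best_round  # round with the best validation score
```

In this sketch the 150 rounds between the best round and the stopping round are still part of the fitted model, which is why a plain predict(clf, test) uses all of them unless the number of trees is limited explicitly.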