2016-04-14 97 views

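Viewing the watchlist history in the xgboost R package

I am using the xgboost R package for a multiclass classification task. Here is a piece of code I wrote to illustrate the problem (the inputs and outputs are randomly generated, so the results are of course meaningless; it is just something I played with to learn how to work with the package):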

require(xgboost) 
# First of all I set some parameters 
featureNumber = 5 
num_class = 4 
obs = 1000 

# I declare a function that I will use to generate my categorical labels
generateLabels <- function(x, num_class) {
    label <- 0
    if (runif(1, min = 0, max = 1) < 0.1) {
        label <- 0
    } else {
        label <- which.max(x) - 1
        foo <- runif(1, min = 0, max = 1)
        if (foo > 0.9) { label <- label + 1 }
        if (foo < 0.1) { label <- label - 1 }
    }
    # Clamp the label to the valid range 0..num_class-1
    return(max(min(label, num_class - 1), 0))
}

# I generate a random training set and its labels
features <- matrix(runif(featureNumber*obs, 1, 10), ncol = featureNumber) 
labels <- apply(features, 1, generateLabels,num_class = num_class) 
dTrain <- xgb.DMatrix(data = features, label = labels) 

# I generate a random test set and its labels
testObs = floor(obs*0.25) 
featuresTest <- matrix(runif(featureNumber*testObs, 1, 10), ncol = featureNumber) 
labelsTest <- apply(featuresTest, 1, generateLabels, num_class = num_class) 
dTest <- xgb.DMatrix(data = featuresTest, label = labelsTest) 

# I train the model
xgbm <- xgb.train(data = dTrain,
                  nrounds = 10,
                  objective = "multi:softprob",
                  eval_metric = "mlogloss",
                  watchlist = list(train = dTrain, eval = dTest),
                  num_class = num_class)  # labels take values 0..num_class-1

This works correctly and produces the expected results; here are a few lines:

[0] train-mlogloss:1.221495 eval-mlogloss:1.292785 
[1] train-mlogloss:0.999905 eval-mlogloss:1.121077 
[2] train-mlogloss:0.846809 eval-mlogloss:1.014519 
[3] train-mlogloss:0.735182 eval-mlogloss:0.942461 
[4] train-mlogloss:0.650207 eval-mlogloss:0.891341 
[5] train-mlogloss:0.580136 eval-mlogloss:0.851774 
[6] train-mlogloss:0.524390 eval-mlogloss:0.827973 
[7] train-mlogloss:0.475884 eval-mlogloss:0.815081 
[8] train-mlogloss:0.435342 eval-mlogloss:0.799799 
[9] train-mlogloss:0.402307 eval-mlogloss:0.789209 

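What I cannot manage to do is store these values so that I can use them later. Is it possible to do that? It would be very helpful for parameter tuning.

P.S. I know I could use the cross-validation method included in the package, xgb.cv, to get similar results; but I would rather use this approach to keep more control over what is happening. Besides, since these metrics are being computed anyway, it seems a waste of computing power if the only thing I can do with them is read them on the screen.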

Answers

You can access the values for the best round with xgbm$bestScore and xgbm$bestInd.

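I edited the question to include a complete code example. The commands you suggested return NULL, and I could not find them in the documentation. Thanks anyway for trying to help. – zenagian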
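
One way to keep the values, sketched here under assumptions rather than taken from the thread: if the lines printed during training can be captured as text, they can be parsed into a data frame. The helper name parse_watchlist_log is hypothetical, and the sketch assumes each line has the form shown in the question ("[iter] name:value name:value"). Later releases of the package may also keep this history on the returned model (for example as an evaluation_log element), which would make parsing unnecessary; that depends on the installed version.

```r
# Hypothetical helper (not part of xgboost): turn printed watchlist lines
# into a data frame with one row per boosting round.
parse_watchlist_log <- function(lines) {
  # Keep only lines that start with "[<iteration>]"
  lines <- grep("^\\[[0-9]+\\]", trimws(lines), value = TRUE)
  rows <- lapply(lines, function(line) {
    tokens <- strsplit(line, "[[:space:]]+")[[1]]
    iter <- as.integer(gsub("\\[|\\]", "", tokens[1]))
    # Remaining tokens look like "train-mlogloss:1.221495"
    pairs <- strsplit(tokens[-1], ":", fixed = TRUE)
    values <- as.numeric(vapply(pairs, `[`, character(1), 2))
    names(values) <- gsub("-", "_", vapply(pairs, `[`, character(1), 1),
                          fixed = TRUE)
    as.data.frame(c(list(iter = iter), as.list(values)))
  })
  do.call(rbind, rows)
}

# Lines copied from the output in the question. In practice they might come
# from something like capture.output(xgbm <- xgb.train(...)), assuming the
# package writes the log to standard output.
log_lines <- c(
  "[0] train-mlogloss:1.221495 eval-mlogloss:1.292785",
  "[1] train-mlogloss:0.999905 eval-mlogloss:1.121077",
  "[2] train-mlogloss:0.846809 eval-mlogloss:1.014519"
)
history <- parse_watchlist_log(log_lines)

# Round with the lowest loss on the evaluation set
best <- history$iter[which.min(history$eval_mlogloss)]
```

With the history in a data frame, the per-round metrics can be plotted or compared across parameter settings instead of only being read from the screen.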
