2015-12-21 123 views
1

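xgboost problem with R

I am using xgboost in R to do logistic regression. I followed the steps from here, but I have two problems. The dataset can be found here.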

First, when I run the following code:

bst <- xgboost(data = sparse_matrix, label = output_vector,nrounds = 39,param) 

I get:

[0]train-rmse:0.350006 
[1]train-rmse:0.245008 
[2]train-rmse:0.171518 
[3]train-rmse:0.120065 
[4]train-rmse:0.084049 
[5]train-rmse:0.058835 
[6]train-rmse:0.041185 
[7]train-rmse:0.028830 
[8]train-rmse:0.020182 
[9]train-rmse:0.014128 
[10]train-rmse:0.009890 
[11]train-rmse:0.006923 
[12]train-rmse:0.004846 
[13]train-rmse:0.003392 
[14]train-rmse:0.002375 
[15]train-rmse:0.001662 
[16]train-rmse:0.001164 
[17]train-rmse:0.000815 
[18]train-rmse:0.000570 
[19]train-rmse:0.000399 
[20]train-rmse:0.000279 
[21]train-rmse:0.000196 
[22]train-rmse:0.000137 
[23]train-rmse:0.000096 
[24]train-rmse:0.000067 
[25]train-rmse:0.000047 
[26]train-rmse:0.000033 
[27]train-rmse:0.000023 
[28]train-rmse:0.000016 
[29]train-rmse:0.000011 
[30]train-rmse:0.000008 
[31]train-rmse:0.000006 
[32]train-rmse:0.000004 
[33]train-rmse:0.000003 
[34]train-rmse:0.000002 
[35]train-rmse:0.000001 
[36]train-rmse:0.000001 
[37]train-rmse:0.000001 
[38]train-rmse:0.000000 

The train-rmse finally reaches 0! Is this normal? As far as I know, train-rmse should not normally reach 0. So where is my problem?

Second, when I run

importance <- xgb.importance(sparse_matrix@Dimnames[[2]], model = bst)

I get an error:

Error in eval(expr, envir, enclos) : object 'Yes' not found.

I don't know what this means; perhaps the first problem causes the second. Here is my full code:

library(data.table) 
train_x<-fread("train_x.csv") 
str(train_x) 
train_y<-fread("train_y.csv") 
str(train_y) 
train<-merge(train_y,train_x,by="uid") 
train$uid<-NULL 
test<-fread("test_x.csv") 
require(xgboost) 
require(Matrix) 
sparse_matrix <- sparse.model.matrix(y~.-1, data = train) 
head(sparse_matrix) 
output_vector = train[,y] == "Marked" 
param <- list(objective = "binary:logistic", booster = "gblinear", 
      nthread = 2, alpha = 0.0001,max.depth = 4,eta=1,lambda = 1) 
bst <- xgboost(data = sparse_matrix, label = output_vector,nrounds = 39,param) 
importance <- xgb.importance(sparse_matrix@Dimnames[[2]], model = bst)

Answers

1

I ran into the same problem (Error in eval(expr, envir, enclos) : object 'Yes' not found), and the cause was the following.

I was trying to run:

dt = data.table(x = runif(10), y = 1:10, z = 1:10) 
label = as.logical(dt$z) 
train = dt[, z := NULL] 
trainAsMatrix = as.matrix(train) 
label = as.matrix(label) 

bst <- xgboost(data = trainAsMatrix, label = label, max.depth = 8, 
       eta = 0.3, nthread = 2, nround = 50, objective = "reg:linear") 
bst$featureNames = names(train) 
xgb.importance(model = bst) 

The problem comes from the line

label = as.logical(dt$z) 

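That line was left over from the last time I used xgboost, when I wanted to predict a categorical variable. Since I now want to do regression, it should read: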

label = dt$z 

Perhaps something similar is causing the problem in your case?
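To see why that one line matters, here is a minimal base-R sketch (the variable names are only illustrative and not taken from any library). `as.logical()` maps every non-zero number to `TRUE`, so converting the `1:10` column produces a label with a single distinct value:

```r
# Illustrative values only; z mirrors the 1:10 column in the example above.
z <- 1:10

label_logical <- as.logical(z)  # every non-zero value becomes TRUE
label_numeric <- z              # keeps the ten distinct regression targets

print(unique(label_logical))          # a single value: the label is constant
print(length(unique(label_numeric)))  # ten distinct values
```

With the logical version the model has nothing to distinguish between rows, which is consistent with the error going away once the numeric column is used directly.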

1

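Maybe this helps. I regularly get the same error when the label has zero variance. This is with the current CRAN version of xgboost (0.4.4), which is a bit old by now. xgb.train happily accepts such a label (reporting an AUC of 0.50), but the error appears when xgb.importance is called.

Cheers

Otto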

[0] train-auc:0.500000 validate-auc:0.500000 
[1] train-auc:0.500000 validate-auc:0.500000 
[2] train-auc:0.500000 validate-auc:0.500000 
[3] train-auc:0.500000 validate-auc:0.500000 
[4] train-auc:0.500000 validate-auc:0.500000 

[1] "XGB error: Error in eval(expr, envir, enclos): object 'Yes' not found\n" 

I saw the same error when my train AUC was 0.5000, i.e. when there was no variation in my predictions.
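The zero-variance case described in the answer above can be caught before training. Below is a minimal sketch in base R; `has_label_variation` is a hypothetical helper name (it is not part of xgboost), and the sample vectors are made up for illustration:

```r
# Hypothetical helper (not an xgboost function): TRUE when the label
# contains at least two distinct non-missing values.
has_label_variation <- function(label) {
  length(unique(label[!is.na(label)])) >= 2
}

# Made-up labels for illustration.
constant_label <- rep(FALSE, 5)  # e.g. a comparison that never matches
mixed_label    <- c(TRUE, FALSE, TRUE, TRUE, FALSE)

print(has_label_variation(constant_label))  # FALSE: no variation to learn from
print(has_label_variation(mixed_label))     # TRUE

# One way to use it before fitting a model and asking for importances:
if (!has_label_variation(constant_label)) {
  message("Label has a single value; check how it was constructed.")
}
```

In the question's script, running the same check (or simply `table(output_vector)`) on `output_vector` would show whether the comparison with "Marked" ever matches.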