2016-10-31 27 views

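I am trying to tune xgboost hyperparameters with the caret library. My dataset contains many factor variables, and xgboost treats data as numeric, so I created dummy variables using feature hashing. However, when I run caret's train, it fails: the xgb.DMatrix cannot be passed to caret.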

#Using Feature hashing to convert all the factor variables to dummies 
objTrain_hashed = hashed.model.matrix(~., data=train1[,-27], hash.size=2^15, transpose=FALSE) 
#created a dense matrix which is normally accepted by xgboost method in R 
#Hoping I could pass it caret as well 
dmodel <- xgb.DMatrix(objTrain_hashed[, ], label = train1$Walc) 

xgb_grid_1 = expand.grid(
    nrounds = 500, 
    max_depth = c(5, 10, 15), 
    eta = c(0.01, 0.001, 0.0001), 
    gamma = c(1, 2, 3), 
    colsample_bytree = c(0.4, 0.7, 1.0), 
    min_child_weight = c(0.5, 1, 1.5) 
) 


xgb_trcontrol_1 = trainControl(
    method = "cv", 
    number = 3, 
    verboseIter = TRUE, 
    returnData = FALSE, 
    returnResamp = "all",              # save losses across all models 
    classProbs = TRUE,               # set to TRUE for AUC to be computed 
    summaryFunction = twoClassSummary, 
    allowParallel = TRUE 
) 

xgb_train1 <- train(Walc ~.,dmodel,method = 'xgbTree',trControl = xgb_trcontrol_1, 
        metric = 'accuracy',tuneGrid = xgb_grid_1)   # argument name is tuneGrid (case-sensitive) 

I get the following error:

Error in as.data.frame.default(data) : 
    cannot coerce class "xgb.DMatrix" to a data.frame 

Any suggestions on how I can get past this error?
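For context, the error text names `as.data.frame.default(data)`, which suggests that the formula interface tried to coerce the `data` argument to a data frame and found no conversion method for the `xgb.DMatrix` class. The sketch below reproduces the same kind of failure in base R only. The object `fake_dmatrix` is a hypothetical stand-in that merely carries the class name; it is not a real `xgb.DMatrix`.

```r
# Stand-in object (assumption): only the class attribute matters here,
# this is NOT a real xgb.DMatrix built by xgboost.
fake_dmatrix <- structure(list(), class = "xgb.DMatrix")

# as.data.frame() has no method for this class, so it falls back to
# as.data.frame.default(), which raises an error.
msg <- tryCatch(
  {
    as.data.frame(fake_dmatrix)
    "no error"
  },
  error = function(e) conditionMessage(e)
)

print(msg)
```

If this reading is right, any object without a data-frame conversion would fail the same way when passed as `data` alongside a formula.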

Answers


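How about using sparse.model.matrix() instead of hashed.model.matrix()? It works on my computer. Also, do not convert the result to xgb.DMatrix(); pass it to the train() function as it is, in its sparse.model.matrix() form.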

Something like:

model_data <- sparse.model.matrix(Y~., raw_data) 

xgb_train1 <- train(Y ~.,model_data, <bla bla> ...) 
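A minimal sketch of the first step above, using only the Matrix package. The data frame `toy` and its columns `school`, `sex` and `age` are hypothetical stand-ins for the real training data; only `Walc` is taken from the question, and treating it as a two-level factor is an assumption.

```r
library(Matrix)  # provides sparse.model.matrix()

# Hypothetical toy data standing in for the real training set.
toy <- data.frame(
  Walc   = factor(c("low", "high", "low", "high", "low", "high")),
  school = factor(c("GP", "MS", "GP", "GP", "MS", "MS")),
  sex    = factor(c("F", "M", "M", "F", "F", "M")),
  age    = c(15, 16, 17, 15, 18, 16)
)

# Expand the factor predictors into dummy columns, stored sparsely.
# "- 1" drops the intercept column from the design matrix.
x_sparse <- sparse.model.matrix(Walc ~ . - 1, data = toy)

# Keep the outcome as a separate vector instead of a label inside a DMatrix.
y <- toy$Walc

print(class(x_sparse))
print(colnames(x_sparse))
print(dim(x_sparse))
```

caret's documentation describes the non-formula interface of `train()` as accepting a matrix, data frame or sparse matrix with column names, so a call of the form `train(x = x_sparse, y = y, method = "xgbTree", ...)` may be the safer variant to try if the formula form still attempts a data-frame conversion.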

Hope it works... Thanks.