2017-06-12

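Machine learning with R and the randomForestSRC package

I am trying to use "surv.randomForestSRC" as a learner for machine learning in R. My code and results are below. "newHCC" is a survival data set of HCC patients with multiple numeric parameters.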

> newHCC$status = (newHCC$status == 1) 
> surv.task = makeSurvTask(data = newHCC, target = c("time", "status")) 
> surv.task 
Supervised task: newHCC 
Type: surv 
Target: time,status 
Events: 61 
Observations: 127 
Features: 
numerics  factors  ordered 
      30        0        0 
Missings: FALSE 
Has weights: FALSE 
Has blocking: FALSE 

> lrn = makeLearner("surv.randomForestSRC") 
> rdesc = makeResampleDesc(method = "RepCV", folds=10, reps=10) 
> r = resample(learner = lrn, task = surv.task, resampling = rdesc) 
[Resample] repeated cross-validation iter 1: cindex.test.mean=0.485 
[Resample] repeated cross-validation iter 2: cindex.test.mean=0.556 
[Resample] repeated cross-validation iter 3: cindex.test.mean=0.825 
[Resample] repeated cross-validation iter 4: cindex.test.mean=0.81 
... 
[Resample] repeated cross-validation iter 100: cindex.test.mean=0.683 
[Resample] Aggr. Result: cindex.test.mean=0.688 

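I have a few questions.

  1. How can I check which parameters (ntree, mtry, etc.) were used?
  2. Is there a good way to tune them?
  3. How can I see the predicted individual risk, like the `predicted` value we get when using the randomForestSRC package directly?

Thank you very much in advance.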


Needed: an [MCVE], and a definition of what you mean by "tuning" and "seeing the predicted individual risk".


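Sorry for my poor English. By "tuning" I mean searching over ntree, mtry, nodesize, etc. to get better results (lower error). By predicted values, I mean the predicted values shown on Rdocumentation (https://www.rdocumentation.org/packages/randomForestSRC/versions/2.4.1/topics/predict.rfsrc).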

Answer

  1. and 2. You can try the following:

    # Search space for the random forest hyperparameters
    surv_param <- makeParamSet(
      makeIntegerParam("ntree", lower = 50, upper = 100),
      makeIntegerParam("mtry", lower = 1, upper = 6),
      makeIntegerParam("nodesize", lower = 10, upper = 50),
      makeIntegerParam("nsplit", lower = 3, upper = 50)
    )

    # Random search over 10 candidate configurations
    rancontrol <- makeTuneControlRandom(maxit = 10L)
    surv_tune <- tuneParams(learner = lrn, resampling = rdesc, task = surv.task,
                            par.set = surv_param, control = rancontrol)

    # Refit on the full task with the best values found
    surv.tree <- setHyperPars(lrn, par.vals = surv_tune$x)
    surv <- mlr::train(surv.tree, surv.task)
    getLearnerModel(surv)  # inspect the underlying rfsrc model
    model <- predict(surv, surv.task)
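On question 1 specifically, here is a minimal sketch of how the learner's hyperparameters might be inspected in mlr. It assumes the mlr and randomForestSRC packages are installed. The values passed to `setHyperPars` are illustrative only, not recommendations, and `lrn2`, `ps` and `hp` are hypothetical names.

```r
library(mlr)

# Learner as in the question (needs the randomForestSRC package installed).
lrn <- makeLearner("surv.randomForestSRC")

# All hyperparameters mlr exposes for this learner, with types and defaults.
ps <- getParamSet(lrn)
print(ps)

# Values explicitly set on the learner object; parameters not listed here
# are left at the defaults of the underlying package.
print(getHyperPars(lrn))

# Setting values by hand (illustrative numbers only).
lrn2 <- setHyperPars(lrn, ntree = 500L, mtry = 5L, nodesize = 15L)
hp <- getHyperPars(lrn2)
print(hp)
```

After a tuning run, the selected values are in the `x` element of the object returned by `tuneParams`, and the corresponding resampled performance is in `y`.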

  3. As of today, surv.randomForestSRC in mlr cannot predict individual risk. Only the predict type "response" is supported.
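If per-patient output like that on the `predict.rfsrc` documentation page is needed, one possible workaround is to call randomForestSRC directly. The sketch below is not the asker's setup. It uses the `veteran` data set shipped with randomForestSRC as a stand-in for `newHCC`, which is not available here. The split size, `ntree = 200`, and the names `train_df` and `test_df` are assumptions made for illustration.

```r
library(randomForestSRC)

# Stand-in survival data with `time` and `status` columns (137 rows).
data(veteran, package = "randomForestSRC")

# Simple hold-out split so predictions are made on unseen patients.
set.seed(1)
idx <- sample(nrow(veteran), 100)
train_df <- veteran[idx, ]
test_df <- veteran[-idx, ]

# Fit a random survival forest and predict on the held-out patients.
fit <- rfsrc(Surv(time, status) ~ ., data = train_df, ntree = 200)
pred <- predict(fit, newdata = test_df)

# One predicted value (mortality) per test patient; higher means higher risk.
print(head(pred$predicted))

# Estimated survival curves: one row per patient, one column per time
# point in pred$time.interest.
print(dim(pred$survival))
```

A model trained through mlr might be handled the same way, since `getLearnerModel` returns the underlying fitted object, but that route is untested here.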