2016-05-04

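Problem using a custom summary function with parallel execution (caret)

I am trying to use MAPE as the metric for evaluating model performance.

With LOOCV and parallel execution everything works fine, but if I use another resampling method I get this error:

Error in { : task 1 failed - "could not find function "mape""

In serial execution, however, the problem disappears.

The code below provides an example.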

    library(caret)
    library(doParallel)

    data("environmental")

    registerDoParallel(makeCluster(detectCores(), outfile = ''))

    mape <- function(y, yhat) mean(abs((y - yhat)/y))

    mapeSummary <- function (data, lev = NULL, model = NULL) {
        out <- mape(data$obs, data$pred)
        names(out) <- "MAPE"
        out
    }

    # LOOCV - parallel
    trControlLoocvPar <- trainControl(allowParallel = TRUE,
                                      verboseIter = TRUE,
                                      method = "LOOCV",
                                      summaryFunction = mapeSummary)

    # LOOCV - serial
    trControlLoocvSer <- trainControl(allowParallel = FALSE,
                                      verboseIter = TRUE,
                                      method = "LOOCV",
                                      summaryFunction = mapeSummary)

    # Bootstrapping - parallel
    trControlBootPar <- trainControl(allowParallel = TRUE,
                                     verboseIter = TRUE,
                                     method = "boot",
                                     summaryFunction = mapeSummary)

    # Bootstrapping - serial
    trControlBootSer <- trainControl(allowParallel = FALSE,
                                     verboseIter = TRUE,
                                     method = "boot",
                                     summaryFunction = mapeSummary)

    trControlList <- list(trControlLoocvSer,
                          trControlLoocvPar,
                          trControlBootSer,
                          trControlBootPar)

    # Fit one glmnet model per resampling setup
    models <- lapply(trControlList,
                     function(control) {
                         train(y = environmental$ozone,
                               x = environmental[, -1],
                               method = "glmnet",
                               trControl = control,
                               metric = "MAPE",
                               maximize = FALSE)
                     })

My operating system is El Capitan 10.11.4 and the caret version is 6.0.62.

Answer

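As the message indicates, your parallel workers cannot find the mape function. Each worker is a separate R process, and a function defined in your main session's workspace is not available there unless it is sent to the worker.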
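
The failure can be reproduced without caret. The following is a minimal sketch that uses only the base `parallel` package; the two-worker cluster and the toy data frame `d` are assumptions made for illustration and are not part of the original question. It checks whether the workers know an object called `mape` and then tries to run the summary function on them.

```r
library(parallel)

# Helper and summary function as in the question.
mape <- function(y, yhat) mean(abs((y - yhat) / y))
mapeSummary <- function(data, lev = NULL, model = NULL) {
  out <- mape(data$obs, data$pred)
  names(out) <- "MAPE"
  out
}
# Make sure `mape` is looked up in the global environment, as it is
# when the function is defined at the top level of a script.
environment(mapeSummary) <- globalenv()

# Toy data (hypothetical values, for illustration only).
d <- data.frame(obs = c(100, 200), pred = c(110, 180))

cl <- makePSOCKcluster(2)

# Ask every worker whether an object named "mape" exists there.
worker_has_mape <- unlist(clusterEvalQ(cl, exists("mape")))

# Running the summary function on the workers fails for the same reason;
# the error is caught here and kept as a message string.
worker_result <- tryCatch(
  clusterCall(cl, mapeSummary, d),
  error = function(e) conditionMessage(e)
)

stopCluster(cl)

print(worker_has_mape)
print(worker_result)
```

On a fresh cluster the check should report that no worker has `mape`, and the caught message should mention the missing function, which is the same situation caret runs into when it evaluates the summary function on a worker.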

The simplest solution is to define the mape function inside the mapeSummary function, as shown below. Your parallel processes will then work correctly.

mapeSummary <- function (data, lev = NULL, model = NULL) { 
    mape <- function(y, yhat) mean(abs((y - yhat)/y)) 
    out <- mape(data$obs, data$pred) 
    names(out) <- "MAPE" 

    out 
} 
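
To see that the self-contained version no longer depends on the worker's workspace, it can be called directly on a small cluster. This is a sketch using the base `parallel` package only; the two-worker cluster and the toy data frame `d` are assumptions for illustration.

```r
library(parallel)

# Self-contained summary function: the helper is created inside it at
# call time, so nothing has to exist on the worker beforehand.
mapeSummary <- function(data, lev = NULL, model = NULL) {
  mape <- function(y, yhat) mean(abs((y - yhat) / y))
  out <- mape(data$obs, data$pred)
  names(out) <- "MAPE"
  out
}

# Toy data (hypothetical values): both predictions are off by 10 percent.
d <- data.frame(obs = c(100, 200), pred = c(110, 180))

cl <- makePSOCKcluster(2)
# Evaluate the summary function on every worker.
worker_results <- clusterCall(cl, mapeSummary, d)
stopCluster(cl)

# Evaluate it in the main session for comparison.
local_result <- mapeSummary(d)

print(local_result)
print(worker_results[[1]])
```

Both calls should give the same named value, since the worker needs nothing beyond the function itself and its input data.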

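Bonus:

You can also use clusterEvalQ, one of the clusterApply family of functions. The following works, but it is not the most elegant solution and requires more typing: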

cl <- makePSOCKcluster(detectCores()-1) 
clusterEvalQ(cl, mape <- function(y, yhat) mean(abs((y - yhat)/y))) 
registerDoParallel(cl) 

mapeSummary <- function (data, lev = NULL, model = NULL) { 
    out <- mape(data$obs, data$pred) 
    names(out) <- "MAPE" 
    out 
} 

# Bootstrapping - parallel
trControlBootPar <- trainControl(allowParallel = TRUE,
                                 verboseIter = TRUE,
                                 method = "boot",
                                 summaryFunction = mapeSummary)

train(y = environmental$ozone,
      x = environmental[, -1],
      method = "glmnet",
      trControl = trControlBootPar,
      metric = "MAPE",
      maximize = FALSE)

stopCluster(cl) 
registerDoSEQ()
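
A related option from the same family is `clusterExport`, which copies an existing object from the main session to every worker by name, so the helper does not have to be retyped inside `clusterEvalQ`. The following is a minimal sketch with the base `parallel` package, assuming a two-worker cluster and a toy data frame `d` (both are illustrative, not from the original post). With caret, the export would be done on the cluster that is then passed to `registerDoParallel`, in the same place where `clusterEvalQ` is used above.

```r
library(parallel)

# Helper defined once in the main session.
mape <- function(y, yhat) mean(abs((y - yhat) / y))

# Summary function that relies on `mape` being available where it runs.
mapeSummary <- function(data, lev = NULL, model = NULL) {
  out <- mape(data$obs, data$pred)
  names(out) <- "MAPE"
  out
}

# Toy data (hypothetical values, for illustration only).
d <- data.frame(obs = c(100, 200), pred = c(110, 180))

cl <- makePSOCKcluster(2)

# Copy `mape` from the current environment into each worker's
# global environment.
clusterExport(cl, "mape", envir = environment())

# Confirm the workers can now see the helper, then use it.
exported <- unlist(clusterEvalQ(cl, exists("mape")))
worker_results <- clusterCall(cl, mapeSummary, d)

stopCluster(cl)

print(exported)
print(worker_results[[1]])
```

Compared with nesting the helper inside `mapeSummary`, this keeps `mape` reusable elsewhere in the script, at the cost of having to remember the export step whenever a new cluster is created.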