2016-09-11

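Accessing lists within a list with smbinning.gen() in a loop

I am trying to automate a scoring-model workflow and ran into a problem feeding in the results of smbinning(), which I generate in a loop and store in a list. Each result is itself a list, so I end up with a list of lists. The problem appears when I try to add the results (the bins for the continuous variables) to the data frame: I cannot work out the syntax needed to reach into the nested list level. I tried to get around this by referring to the column number, and by passing the corresponding list name from within the loop. The error I get is: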

Error in `[.data.frame`(df, , col_id) : undefined columns selected

My code is as follows:

colcnt <- ncol(e_mod) 
bucket_resultlist <- list() 
for (i in 2:colcnt) { 
    #curvar = paste0('z', i) 
    curresult = smbinning(df = e_mod, y = "Bankrupt", x = colnames(e_mod)[i], p = 0.05) 
    bucket_resultlist[[paste0('Bin_Result_', colnames(e_mod)[i])]] = curresult #paste0('binresult', colnames(e)[i]) = curresult 
} 

e_mod2 = e_mod 

for (i in 1:length(bucket_resultlist_trunc)) { 
e_mod2 = smbinning.genCUSTOM(e_mod, bucket_resultlist_trunc[[i]] , chrname = i) 
} 

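I even tried defining a custom version of the smbinning.gen() function. In its standard form it simply uses the $ operator on the list it is given (ivout$col_id, ivout$bands), but I need to go down one level from the list I generated and then run smbinning.gen() for each of the lists inside it. Here is the custom code, with the original definitions left as comments: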

smbinning.genCUSTOM = function(df, ivout, chrname = "NewChar") { 
    df = cbind(df, tmpname = NA) 
    ncol = ncol(df) 
    col_id = paste0(ivout, '[[6]]', collapse = NULL) # Original: ivout$col_id 
    # Updated 20160130 
    b = paste0(ivout, '[[4]]', collapse = NULL) # Original: ivout$bands 
    df[, ncol][is.na(df[, col_id])] = 0 # Missing 
    df[, ncol][df[, col_id] <= b[2]] = 1 # First valid 
    # Loop goes from 2 to length(b)-2 if more than 1 cutpoint 
    if (length(b) > 3) { 
     for (i in 2:(length(b) - 2)) { 
      df[, ncol][df[, col_id] > b[i] & df[, col_id] <= b[i + 1]] = i 
     } 
    } 
    df[, ncol][df[, col_id] > b[length(b) - 1]] = length(b) - 1 # Last 
    df[, ncol] = as.factor(df[, ncol]) # Convert to factor for modeling 
    blab = c(paste("01 <=", b[2])) 
    if (length(b) > 3) { 
     for (i in 3:(length(b) - 1)) { 
      blab = c(blab, paste(sprintf("%02d", i - 1), "<=", b[i])) 
     } 
    } else { i = 2 } 
    blab = c(blab, paste(sprintf("%02d", i), ">", b[length(b) - 1])) 

    # Are there ANY missing values 
    # any(is.na(df[,col_id])) 

    if (any(is.na(df[, col_id]))) { 
     blab = c("00 Miss", blab) 
    } 
    df[, ncol] = factor(df[, ncol], labels = blab) 

    names(df)[names(df) == "tmpname"] = chrname 
    return(df) 
} 

Any help is greatly appreciated!

Here is the table structure: http://i.stack.imgur.com/iYau2.png

This is also posted in the Data Science section, but the whole of today …


Comment: I think the crux of the problem is most likely passing the arguments correctly into the `smbinning.genCUSTOM()` function.

Answer

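Thanks to Stack Overflow for being my yellow rubber duck on this one, with only 5 views. The fix is to change how the list elements are extracted inside the function, indexing the list directly instead of building a string with paste0():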

smbinning.genCUSTOM = function(df, ivout, chrname = "NewChar") {
    df = cbind(df, tmpname = NA)
    ncol = ncol(df)
    col_id = ivout[[6]] # paste0(ivout, '[[6]]', collapse = NULL) # Original: ivout$col_id
    # Updated 20160130
    b = ivout[[4]] # paste0(ivout, '[[4]]', collapse = NULL) # Original: ivout$bands
    # ... (rest of the function body as in the question)
}
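To make the difference concrete, here is a minimal sketch that does not need the smbinning package. `toy_result` and `toy_df` are hypothetical stand-ins: only the positions of `bands` (4) and `col_id` (6) follow the comments in the code above, and every value is made up for illustration. It shows that paste0() only builds text that looks like an index expression, while `[[ ]]` actually extracts the element.

```r
# Hypothetical stand-in for one smbinning() result: a list whose 4th
# element is the bands and whose 6th element is the column index.
toy_result <- list(
  slot1  = "placeholder",
  slot2  = "placeholder",
  slot3  = "placeholder",
  bands  = c(0, 10, 20, 30),
  slot5  = "placeholder",
  col_id = 2L
)
toy_df <- data.frame(id = 1:4, score = c(5, 15, 25, NA))

# paste0() coerces each list element to text and appends "[[6]]";
# nothing is evaluated, so the result is a character vector.
built <- paste0(toy_result, "[[6]]")

# Direct indexing returns the stored values.
col_id <- toy_result[[6]]
b      <- toy_result[[4]]

# Using the pasted strings as column names fails, because no column
# of toy_df has any of those names.
err_msg <- tryCatch(
  { toy_df[, built]; NA_character_ },
  error = function(e) conditionMessage(e)
)

# Using the extracted index works.
score_col <- toy_df[, col_id]
```

Since the elements are named, `toy_result$col_id` or `toy_result[["col_id"]]` would work equally well here and is less fragile than relying on position.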

And pass the new data frame e_mod2 rather than e_mod inside the loop, so that each iteration builds on the previous one:

for (i in 1:length(bucket_resultlist_trunc)) {
    e_mod2 = smbinning.genCUSTOM(e_mod2, bucket_resultlist_trunc[[i]], chrname = i)
}
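Why the second change matters can be seen in a small sketch. `add_col()`, `base_df` and `new_cols` are hypothetical names used only for this illustration; `add_col()` plays the role of smbinning.genCUSTOM() in that it returns its input data frame with one extra column.

```r
# Hypothetical helper: returns df with one extra column called `name`.
add_col <- function(df, values, name) {
  df[[name]] <- values
  df
}

base_df  <- data.frame(id = 1:3)
new_cols <- list(a = c(1, 2, 3), b = c(4, 5, 6))

# Passing the original data frame each time: every iteration starts
# from base_df again, so only the last added column survives.
lost <- base_df
for (nm in names(new_cols)) {
  lost <- add_col(base_df, new_cols[[nm]], nm)
}

# Passing the accumulated data frame: each iteration extends the
# result of the previous one.
kept <- base_df
for (nm in names(new_cols)) {
  kept <- add_col(kept, new_cols[[nm]], nm)
}

names(lost)  # "id" "b"
names(kept)  # "id" "a" "b"
```

Looping over `names(new_cols)` (or `seq_along(new_cols)`) also avoids the `1:length(x)` pitfall, which iterates over `1, 0` when the list is empty.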