2016-06-21
0

Spark/Scala: defining a UDF that takes an input parameter

I am working in Scala and Spark. I have defined a UDF; here it is:

def udfcrpentcd = udf((CORP_ENT_CD: String) => {
  state_name match {
    case "IL1" if state_name.contains("IL1") => "IL1"
    case "OK1" if state_name.contains("OK1") => "OK1"
    case "TX1" if state_name.contains("TX1") => "TX1"
    case "NM1" if state_name.contains("NM1") => "NM1"
    case "MT1" if state_name.contains("MT1") => "MT1"
    case _ => "Null"
  }
})




val local_masterdb = old_dataframe_temp_masterdbDataFrame.withColumn(
  "new_columna_name_CORP_ENT_CD",
  udfcrpentcd(old_dataframe_temp_masterdbDataFrame("last_column_of_old_dataframe_DB_STATUS") + 1))
local_masterdb.show()

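Now I want to reuse the UDF above.

I want to make it generic: instead of comparing against state_name, I need to pass in a string and have it return the CRP_ENT_CD. That is what I am trying to do.

Is this the right way?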

def udfcrpentcd(input_parameter: String) = udf((CORP_ENT_CD: String) => {
  input_parameter match {
    case "IL1" if input_parameter.contains("IL1") => "IL1"
    case "OK1" if input_parameter.contains("OK1") => "OK1"
    case "TX1" if input_parameter.contains("TX1") => "TX1"
    case "NM1" if input_parameter.contains("NM1") => "NM1"
    case "MT1" if input_parameter.contains("MT1") => "MT1"
    case _ => "Null"
  }
})

If this is the right way, how do I call it? Any help with passing the parameter would be appreciated.

Answers

1

Here is an example of how to pass a parameter to a UDF.

import org.apache.spark.sql.functions.udf

// Plain Scala function value: maps the input string to its code, or "Null" if unknown.
def udfcrpentcd: String => String = (input_parameter: String) => {
  input_parameter match {
    case "IL1" if input_parameter.contains("IL1") => "IL1"
    case "OK1" if input_parameter.contains("OK1") => "OK1"
    case "TX1" if input_parameter.contains("TX1") => "TX1"
    case "NM1" if input_parameter.contains("NM1") => "NM1"
    case "MT1" if input_parameter.contains("MT1") => "MT1"
    case _ => "Null"
  }
}

// Wrap the function as a Spark UDF so it can be applied to a column.
val udfcrpentcd_res = udf(udfcrpentcd)

val local_masterdb = old_dataframe_temp_masterdbDataFrame.withColumn(
  "new_columna_name_CORP_ENT_CD",
  udfcrpentcd_res(old_dataframe_temp_masterdbDataFrame("last_column_of_old_dataframe_DB_STATUS") + 1))
local_masterdb.show()
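
If the goal is a UDF that also takes a fixed string when it is built, as in the curried definition in the question, one possible approach is a factory method that returns a new UDF for each parameter value. The following is only a sketch under assumptions: the parameter name defaultCode, the helper toCorpEntCd, the object name, and the column names DB_STATUS and CORP_ENT_CD are made up for illustration, and it needs Spark SQL on the classpath.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.UserDefinedFunction
import org.apache.spark.sql.functions.{col, udf}

object ParameterisedUdfSketch {
  // Plain function holding the lookup, so it can be checked without Spark.
  // defaultCode is a hypothetical parameter: the value returned for unknown codes.
  def toCorpEntCd(defaultCode: String)(value: String): String = value match {
    case "IL1" | "OK1" | "TX1" | "NM1" | "MT1" => value
    case _ => defaultCode
  }

  // Factory: each call builds a UDF with defaultCode captured in its closure.
  def udfcrpentcd(defaultCode: String): UserDefinedFunction =
    udf((value: String) => toCorpEntCd(defaultCode)(value))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("parameterised-udf-sketch")
      .getOrCreate()
    import spark.implicits._

    // Illustrative data only; the real DataFrame comes from the question.
    val df = Seq("IL1", "TX1", "ZZ9").toDF("DB_STATUS")

    // First argument list fixes the parameter, second applies the UDF to a column.
    df.withColumn("CORP_ENT_CD", udfcrpentcd("Null")(col("DB_STATUS"))).show()

    spark.stop()
  }
}
```

The call shape is udfcrpentcd(parameter)(column): the first pair of parentheses runs on the driver and returns the UDF, the second applies it to the column. A parameter that varies per row would instead have to be passed as a second column, for example with lit(...).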