2016-05-19 49 views
-1

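How do I pass a two-dimensional array to a user-defined function?

Using a UDF means that each factor c1, c2, c3 has to be passed as a separate argument. Is there a more flexible solution, for example a way to pass a sequence of these factors to the UDF?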

import org.apache.spark.sql.functions.udf

val myFunction = udf {
  (userBias: Float, productBias: Float, productBiases: Map[Long, Float],
   userFactors: Seq[Float], productFactors: Seq[Float],
   c1: String, c2: String, c3: String) =>

    var result = Float.NaN

    // result calculation

    result
}

I then call this function as follows (dataset is a DataFrame):

myFunction(userBias("bias"),
      productBias("bias"),
      productBias("biases"),
      userFactors("features"),
      productFactors("features"),
      dataset(factors(0)), dataset(factors(1)), dataset(factors(2)))

If I try something like the following instead, the compiler says "not applicable":

val myFactors = dataset.select(factors.head, factors.tail: _*) 

myFunction(userBias("bias"), 
      productBias("bias"), 
      productBias("biases"), 
      userFactors("features"), 
      productFactors("features"), 
      myFactors) 
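
A note on the attempt above: `dataset.select(...)` returns a DataFrame, while a UDF is applied to Column arguments, which is most likely why the compiler rejects the call. One possible workaround, sketched here under assumptions, is to pack the factor columns into a single array column with `array(...)` and let the UDF take one `Seq[String]`. The names `describeFactors`, `describeUdf`, and the sample data are hypothetical and only stand in for the real result calculation.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{array, col, udf}

object FactorsAsArrayColumn {
  // Hypothetical stand-in for the real calculation: all factor values
  // arrive as one Seq instead of separate c1, c2, c3 parameters.
  def describeFactors(bias: Float, factors: Seq[String]): String =
    s"$bias:" + factors.mkString("|")

  // An array<string> column reaches a Scala UDF as Seq[String].
  val describeUdf = udf(describeFactors _)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("factors-as-array")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data, for illustration only.
    val dataset = Seq((0.5f, "a", "b", "c")).toDF("bias", "c1", "c2", "c3")
    val factors = Seq("c1", "c2", "c3")

    // array(...) combines any number of Columns into one array-typed Column.
    val factorColumn = array(factors.map(dataset(_)): _*)

    dataset.select(describeUdf(col("bias"), factorColumn).as("described")).show()
    spark.stop()
  }
}
```

With this shape the list of factor names can grow or shrink without changing the UDF signature. It assumes all factor columns share one type, since `array` needs a common element type.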
+0

Why was this question downvoted? – Klue

Answers

0

If you have a two-dimensional array like this:

val xy = Array.ofDim[Int](numrows, numcolumns)
isFunction(xy)

then the definition of isFunction looks like this:

def isFunction(arg: Array[Array[Int]]): Unit = {
  // arg(i)(j) reads row i, column j; this prints the first element of every row
  for (i <- arg.indices) println(arg(i)(0))
}
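
For completeness, here is a small self-contained sketch of the same idea in plain Scala, without Spark. The object name `TwoDimArrayDemo`, the helper `firstColumnSum`, and the sample values are made up for illustration; the point is only that the parameter type is `Array[Array[Int]]` and that elements are read as `grid(row)(column)`.

```scala
object TwoDimArrayDemo {
  // Sums the first column of a 2D array: row index first, then column index.
  def firstColumnSum(grid: Array[Array[Int]]): Int =
    grid.indices.map(i => grid(i)(0)).sum

  def main(args: Array[String]): Unit = {
    val numRows = 2
    val numColumns = 3

    // Array.ofDim allocates a numRows x numColumns array filled with zeros.
    val xy = Array.ofDim[Int](numRows, numColumns)
    xy(0)(0) = 4
    xy(1)(0) = 6

    println(firstColumnSum(xy)) // prints 10
  }
}
```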