2017-07-16 43 views

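Retrieve data from a WrappedArray in Scala

I have the following simple program, and I do not know how to read the values held in the resulting array in Scala.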

val all_marks = Result.groupBy("class", "school").agg(collect_list("mark") as "marks",count("*") as "cnt").where($"cnt" > 10) 

var mrk=all_marks.collect().map(mark=>""+mark(2)) 

The result looks like this:

mrk: Array[String] = Array(WrappedArray(52.0, 18.0, 17.0, 36.0, 22.0, 22.0), WrappedArray(49.0, 53.0, 41.0, 30.0, 48.0, 36.0)) 

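I need to iterate over the mrk array, read each individual WrappedArray, and use every mark in each WrappedArray for further mathematical calculations. How can I read each WrappedArray in a simple way?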
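The `Array[String]` result above most likely comes from `"" + mark(2)`, which calls `toString` on the whole collection. The following plain-Scala sketch (no Spark needed) illustrates that effect. The object name and the sample values are hypothetical and chosen only for illustration.

```scala
import scala.util.Try

object StringifyDemo {
  // Stand-in for one collected "marks" value (hypothetical sample data).
  val marks: Seq[Double] = Seq(52.0, 18.0, 17.0)

  // Concatenating with "" calls toString on the whole collection,
  // so the result is one String, not a sequence of numbers.
  val asText: String = "" + marks

  // Parsing that text as a single Double fails with NumberFormatException.
  val parsed: Try[Double] = Try(asText.toDouble)

  // Keeping the Seq[Double] type allows per-mark arithmetic directly.
  val doubled: Seq[Double] = marks.map(_ * 2)
}
```

Once the values are stringified, the individual marks can only be recovered by parsing the text again, so it is usually simpler to keep the typed sequence.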


Have you tried using 'mrk.foreach'? Also, '.map(mark => mark(2).toString)'. – philantrovert


I tried for (e <- mrk) { val d = e.toDouble }, but it fails with the error "java.lang.NumberFormatException: For input string: "WrappedArray" –


Yes, I tried .map(mark => mark(2).toString), but this method does not convert the marks to –

Answer


You need to replace var mrk=all_marks.collect().map(mark=>""+mark(2)) with

val mrk = all_marks.select("marks")

Then convert the DataFrame to an RDD (of lists), and then back to a DataFrame

// needs import spark.implicits._ for toDF; use Double instead of Int if the marks are doubles
val toRDD = mrk.rdd.map(_.getSeq[Int](0)).toDF("marks")
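If the goal is only to read the collected values, a possible alternative to the RDD round trip is to read the array column from each `Row` as a typed sequence. The sketch below assumes Spark SQL is on the classpath and that the marks are doubles, as the output in the question suggests. `ReadMarks`, `marksOf` and `mean` are hypothetical helper names, not part of Spark.

```scala
import org.apache.spark.sql.Row

object ReadMarks {
  // Reads the array column at position `idx` as a typed sequence of doubles.
  // Row.getSeq returns the WrappedArray as a Seq without stringifying it.
  def marksOf(row: Row, idx: Int): Seq[Double] = row.getSeq[Double](idx).toSeq

  // Hypothetical follow-up calculation: the mean of one group's marks.
  def mean(marks: Seq[Double]): Double =
    if (marks.isEmpty) 0.0 else marks.sum / marks.size
}
```

With the question's DataFrame, something like `all_marks.collect().map(r => ReadMarks.marksOf(r, 2))` should then yield an `Array[Seq[Double]]` that can be iterated with ordinary collection methods.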

Then define a UDF

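import org.apache.spark.sql.functions.udf

// define udf
val createUdf = udf((list: Seq[Int]) => {
  val ascending = list.sorted // sorts in ascending order
  // keep the accumulator local so every row starts from an empty string
  var read_row_by_row = ""
  // in this loop you can add whatever calculations you like
  for (i <- ascending.indices) {
    read_row_by_row = read_row_by_row + "," + ascending(i)
  }
  read_row_by_row
})
// toRDD is the DataFrame built in the previous step; overwriting "marks" matches the output below
val g = toRDD.withColumn("marks", createUdf($"marks"))
g.show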
+--------------------+ 
|               marks|
+--------------------+ 
|,17,17,17,17,18,1...| 
|,18,18,18,18,19,1...| 
|,18,23,24,24,24,2...| 
|,18,23,24,24,24,2...| 
|,17,18,18,18,18,1...| 
|,25,35,36,39,41,4...| 
|,25,35,36,39,41,4...| 
|,31,31,33,33,33,3...|
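
The body of the UDF above is ordinary Scala collection code, so the per-list logic can be written and tried outside Spark first. The following is a minimal sketch under that assumption. `MarksFormat`, `sortedCsv` and `median` are hypothetical names, and the median is only one example of a "further calculation".

```scala
object MarksFormat {
  // Sorts the marks in ascending order and joins them with commas.
  // Unlike the string-appending loop, mkString adds no leading comma.
  def sortedCsv(marks: Seq[Int]): String = marks.sorted.mkString(",")

  // Hypothetical example of a further calculation on the same list: the median.
  def median(marks: Seq[Int]): Double = {
    require(marks.nonEmpty, "median of an empty list is undefined")
    val s = marks.sorted
    val mid = s.size / 2
    if (s.size % 2 == 1) s(mid).toDouble else (s(mid - 1) + s(mid)) / 2.0
  }
}
```

A function like this could then be wrapped with `udf` in the same way as `createUdf`, which keeps the calculation easy to test on its own.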