Compare the current record with all following values in a Spark map
I am new to Scala. Sample data:
1,"jack",34.5
2,"jackk",14.5
3,"jacky",24.5
4,"jack",64.5
And many more.
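I want to compare each field of the first record with the same field of every other record, then the second record with all the others, and so on. (Please ignore the syntax.) I have written the code below: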
import org.apache.spark.sql.Row

val data = sc.parallelize(Seq((1,"jack",34.5),
  (2,"jackk",14.5),
  (3,"jacky",24.5),
  (4,"jack",64.5)))
val res = data.map { f =>
  // This compares a field with itself, so it is always true.
  // What I want is to compare the current record with all following records.
  val rr = f._1.equals(f._1)
  Row(rr)
}
Example:
"jack" with "jackk"
"jack" with "jacky"
"jack" with "jack"
"jackk" with "jacky"
"jackk" with "jack"
"jacky" with "jack"
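To pin down the pairing listed above, here is a minimal local sketch in plain Scala collections (no Spark). The object name `LaterPairs` and the method `laterPairs` are hypothetical, introduced only for illustration. It pairs each element with every element that comes after it by position, so the duplicate "jack" values are kept as separate entries.

```scala
object LaterPairs {
  // Pair every element with each element that appears later in the sequence.
  // Positions are used rather than values, so duplicates are not merged.
  def laterPairs[A](xs: Vector[A]): Seq[(A, A)] =
    for {
      i <- xs.indices
      j <- (i + 1) until xs.length
    } yield (xs(i), xs(j))

  def main(args: Array[String]): Unit = {
    val names = Vector("jack", "jackk", "jacky", "jack")
    // Prints the six pairs in the same order as the example list.
    laterPairs(names).foreach { case (a, b) => println(s"$a with $b") }
  }
}
```

For n records this produces n * (n - 1) / 2 pairs, which is worth keeping in mind before applying the same idea to a large dataset.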
I am using .map because I want the code to run on the cluster.
Please give me some suggestions. Thanks in advance.
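One possible way to do this on the cluster, sketched under assumptions: a plain `map` only sees one record at a time, so the sketch below first attaches a position with `zipWithIndex`, then builds all pairs with `cartesian` and keeps those where the second record comes later. The object name `PairwiseCompare`, the method `compareWithLater`, and the choice to compare only the name field are assumptions made for illustration. Note that `cartesian` is quadratic in the number of records, so this may not scale to very large inputs.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object PairwiseCompare {
  // For every record, compare its name with the name of each later record.
  // Returns (id of current record, id of later record, names equal?).
  def compareWithLater(data: RDD[(Int, String, Double)]): RDD[(Int, Int, Boolean)] = {
    // Attach a position to every record so "later" is well defined.
    val indexed = data.zipWithIndex()
    indexed.cartesian(indexed)
      .filter { case ((_, i), (_, j)) => i < j } // keep only (current, later) pairs
      .map { case ((a, _), (b, _)) => (a._1, b._1, a._2 == b._2) }
  }

  def main(args: Array[String]): Unit = {
    // local[*] is used here only so the sketch can run without a cluster.
    val spark = SparkSession.builder()
      .appName("PairwiseCompare")
      .master("local[*]")
      .getOrCreate()

    val data = spark.sparkContext.parallelize(Seq(
      (1, "jack", 34.5),
      (2, "jackk", 14.5),
      (3, "jacky", 24.5),
      (4, "jack", 64.5)))

    // Six results for four records; only records 1 and 4 share a name.
    compareWithLater(data).collect()
      .sortBy(t => (t._1, t._2))
      .foreach(println)

    spark.stop()
  }
}
```

The other fields could be compared the same way by extending the tuple built in the final `map`.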
Can you consider pattern matching? – 2017-03-09 11:39:17
Please don't consider pattern matching.