2017-05-23 18 views

Answers

0

You can count the objects:

val rdd = sc.parallelize(Seq(("a", 2), ("a", 2), ("a", 3), ("a", 3), ("a", 4)))

val counts = rdd.map((_, 1)).reduceByKey(_ + _) 

and then either reduce:

val min = counts.reduce((x, y) => if (x._1._2 <= y._1._2) x else y)

or use min:

import scala.math.Ordering 

val min = counts.min()(Ordering.by[((String, Int), Int), Int](_._1._2)) 

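Optionally, you can follow this with a replication step, which repeats the minimum element as many times as it occurred: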

min match { 
    case (x, n) => Seq.fill(n)(x) 
} 
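To check the logic without a cluster, the count, min, and replicate steps can be sketched on plain Scala collections. This is only a local analog, not Spark code: `groupBy` stands in for `map((_, 1)).reduceByKey(_ + _)`, `minBy` stands in for `min` with an `Ordering`, and the name `result` is introduced here for illustration.

```scala
// Local (non-Spark) analog of the count -> min -> replicate pipeline.
val data = Seq(("a", 2), ("a", 2), ("a", 3), ("a", 3), ("a", 4))

// Count occurrences of each distinct pair,
// mirroring rdd.map((_, 1)).reduceByKey(_ + _).
val counts: Map[(String, Int), Int] =
  data.groupBy(identity).map { case (pair, group) => (pair, group.size) }

// Pick the (pair, count) entry whose pair has the smallest second element.
val min: ((String, Int), Int) = counts.minBy { case ((_, value), _) => value }

// Repeat the winning pair as many times as it occurred in the input.
val result: Seq[(String, Int)] = min match {
  case (x, n) => Seq.fill(n)(x)
}
```

For this input, `result` holds two copies of `("a", 2)`, since that pair has the smallest value and appears twice.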

If the counts don't matter, just use min directly:

rdd.min()(Ordering.by[(String, Int), Int](_._2)) 
0
case class Item(c: Char, i: Int) 
val items = Array[Item](new Item('a', 2), new Item('a', 2), new Item('a', 3), new Item('a', 3), new Item('a', 3), new Item('a', 4), new Item('a', 6), new Item('a', 5)) 
val rdd = sc.makeRDD(items) 
val minValue = rdd.map(_.i).min() 
val result = rdd.filter(item => item.i == minValue) 