Why doesn't repartitionAndSortWithinPartitions sort?
Here is what I am doing:
val rddkv = sc.parallelize(List(("k1",1),("k2",2),("k1",2),("k3",5),("k3",1)))
//rddkv.collect
//Array[(String, Int)] = Array((k1,1), (k2,2), (k1,2), (k3,5), (k3,1))
rddkv.repartitionAndSortWithinPartitions(new org.apache.spark.RangePartitioner(3,rddkv)).mapPartitionsWithIndex((i,iter_p) => iter_p.map(x=>" index="+i+" value="+x)).collect
//Array[String] = Array(" index=0 value=(k1,1)", " index=0 value=(k1,2)", " index=1 value=(k2,2)", " index=1 value=(k3,5)", " index=1 value=(k3,1)")
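Notice that the values within each partition are not sorted: in partition 1, (k3,5) comes before (k3,1). Why is that? What am I missing?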
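For context, repartitionAndSortWithinPartitions orders the records in each partition by key only, so pairs that share a key (such as the two k3 entries) can come out in any value order. If the values need to be ordered as well, one common workaround is to move the value into the key and partition on the original key alone. The following is only a sketch under assumptions: FirstElementPartitioner is a hypothetical helper written for this example (it is not part of Spark), it hash-partitions instead of using the RangePartitioner from the question, and `sc` is assumed to be the SparkContext provided by spark-shell.

```scala
import org.apache.spark.Partitioner

// Hypothetical helper (not a Spark class): routes a composite (key, value)
// pair by its first element only, so all records sharing the original key
// still land in the same partition.
class FirstElementPartitioner(override val numPartitions: Int) extends Partitioner {
  override def getPartition(key: Any): Int = key match {
    case (k, _) => Math.floorMod(k.hashCode, numPartitions) // always in [0, numPartitions)
  }
}

val rddkv = sc.parallelize(List(("k1", 1), ("k2", 2), ("k1", 2), ("k3", 5), ("k3", 1)))

val sorted = rddkv
  .map { case (k, v) => ((k, v), ()) } // the whole pair becomes the sort key
  .repartitionAndSortWithinPartitions(new FirstElementPartitioner(3))
  .keys                                // drop the placeholder value again

// Print each record with its partition index, as in the question.
sorted
  .mapPartitionsWithIndex((i, it) => it.map(x => " index=" + i + " value=" + x))
  .collect()
  .foreach(println)
```

Because the sort key is now the (String, Int) tuple, the implicit tuple Ordering compares the string first and the integer second, so within a partition (k3,1) should appear before (k3,5). Which partition index each key ends up in depends on its hash code, so the indices will generally differ from the RangePartitioner output shown in the question.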