
I want to sort houses in ascending order by their price. How can I sort a list by a value in Java with Spark?

public class Home implements Serializable { 

    private double price = Math.random() * 1000; 

} 

This is how I currently sort them without Spark:

ArrayList<Home> city; // assume it is initialized with some values 
Collections.sort(city, new Comparator<Home>() { 
    public int compare(Home o1, Home o2) { 
        // ascending order by price 
        return Double.compare(o1.getPrice(), o2.getPrice()); 
    } 
}); 
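For reference, the same in-memory sort can be written more compactly with a Java 8 comparator; a sketch, assuming Home exposes getPrice():

import java.util.Comparator; 

// ascending by price, same effect as the anonymous Comparator above 
city.sort(Comparator.comparingDouble(Home::getPrice)); 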

Now I want to do this sorting with the Apache Spark Java API.

Approach 1:

JavaRDD<Home> r2 = houseRDD.sortBy(i -> i.getPrice(), true, 1); 

Approach 2:

JavaRDD<Home> r = houseRDD.sortBy(new Function<Home, Double>() { 
    private static final long serialVersionUID = 1L; 

    @Override 
    public Double call(Home value) throws Exception { 
        return value.getPrice(); 
    } 

}, true, 1); 

What is wrong with the approaches above? I am getting the following exception:


java.lang.ClassCastException: House cannot be cast to java.lang.Comparable 
    at org.spark_project.guava.collect.NaturalOrdering.compare(NaturalOrdering.java:28) 
    at scala.math.LowPriorityOrderingImplicits$$anon$7.compare(Ordering.scala:153) 
    at scala.math.Ordering$$anon$4.compare(Ordering.scala:111) 
    at org.apache.spark.util.collection.Utils$$anon$1.compare(Utils.scala:35) 
    at org.spark_project.guava.collect.Ordering.max(Ordering.java:551) 
    at org.spark_project.guava.collect.Ordering.leastOf(Ordering.java:667) 
    at org.apache.spark.util.collection.Utils$.takeOrdered(Utils.scala:37) 
    at org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1$$anonfun$30.apply(RDD.scala:1393) 
    at org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1$$anonfun$30.apply(RDD.scala:1390) 
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:785) 
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:785) 
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) 
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319) 
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:283) 
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70) 
    at org.apache.spark.scheduler.Task.run(Task.scala:86) 
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
    at java.lang.Thread.run(Thread.java:745) 
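
Reading the trace: the cast fails inside Utils.takeOrdered, which backs RDD.top()/takeOrdered(), not inside sortBy; without an explicit comparator those methods use the elements' natural ordering. A minimal sketch of the call pattern that reaches this path, assuming houseRDD holds Home objects that do not implement Comparable:

// sortBy orders by the extracted key (the price), so it does not need Comparable 
JavaRDD<Home> sorted = houseRDD.sortBy(h -> h.getPrice(), true, 1); 

// top(n) with no Comparator relies on the natural ordering of the elements 
// and throws ClassCastException unless Home implements Comparable<Home> 
List<Home> top4 = houseRDD.top(4); 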

Updated Home class, as per the comments:

public class Home implements Serializable, Comparable<Home> { 

    private double price = Math.random() * 1000; 

    @Override 
    public int compareTo(Home o) { 
        return Double.compare(this.getPrice(), o.getPrice()); 
    } 
} 
From which line? In approach one? – shmosel

I am getting this exception. – irobo

From which line? – shmosel

Answer

List<Home> homes; // initialize it with some data 
JavaRDD<Home> homeRDD = SparkUtil.getSparkContext().parallelize(homes); 

public class Home implements Serializable, Comparable<Home> { 

    private double price = Math.random() * 1000; 

    @Override 
    public int compareTo(Home o) { 
        return Double.compare(this.getPrice(), o.getPrice()); 
    } 
} 

Now try the same code:

JavaRDD<Home> sortedHomeRDD = homeRDD.sortBy(i -> i.getPrice(), true, 1); 

sortedHomeRDD.top(4); // returns the 4 highest-priced homes 
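
As a side note, Comparable is only required because top() and takeOrdered() fall back to the elements' natural ordering when no comparator is supplied. A sketch of an alternative that passes an explicitly serializable Comparator instead, assuming the same homeRDD as above; with this, Home would not need to implement Comparable:

import java.io.Serializable; 
import java.util.Comparator; 
import java.util.List; 

// The comparator runs on the executors, so it must be Serializable; 
// the intersection cast makes the lambda serializable. 
Comparator<Home> byPrice = 
    (Comparator<Home> & Serializable) (a, b) -> Double.compare(a.getPrice(), b.getPrice()); 

List<Home> cheapest4 = homeRDD.takeOrdered(4, byPrice); // 4 lowest-priced homes 
List<Home> priciest4 = homeRDD.top(4, byPrice);         // 4 highest-priced homes 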