Spark NullPointerException in a foreach loop
I have an RDD and I want to loop over it. This is what I do:
pointsMap.foreach({ p =>
val pointsWithCoordinatesWithDistance = pointsMap.leftOuterJoin(xCoordinatesWithDistance)
pointsWithCoordinatesWithDistance.foreach(println)
println("---")
})
However, a NullPointerException is thrown:
java.lang.NullPointerException
at org.apache.spark.rdd.RDD.<init>(RDD.scala:125)
at org.apache.spark.rdd.CoGroupedRDD.<init>(CoGroupedRDD.scala:69)
at org.apache.spark.rdd.PairRDDFunctions.cogroup(PairRDDFunctions.scala:651)
at org.apache.spark.rdd.PairRDDFunctions.leftOuterJoin(PairRDDFunctions.scala:483)
at org.apache.spark.rdd.PairRDDFunctions.leftOuterJoin(PairRDDFunctions.scala:555)
...
Both pointsMap and xCoordinatesWithDistance are initialized before the foreach and contain elements. Calling leftOuterJoin outside the foreach loop also works. For the full version of my code, see https://github.com/timasjov/spark-learning/blob/master/src/DBSCAN.scala
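So I cannot use RDD functions (such as join) inside other RDD functions (such as foreach)? If so, how should I rewrite the code? – Bob 2014-10-27 07:26:06
Also, what do you mean by "proper RDD operators"? – Bob 2014-10-27 07:30:05
You should not use an RDD inside an RDD function. I am confused about why you need to put 'pointsMap.leftOuterJoin(xCoordinatesWithDistance)' inside 'foreach'? – zsxwing 2014-10-27 08:06:19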
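A minimal sketch of what the last comment suggests, assuming the join does not depend on the current element p: call leftOuterJoin once on the driver and only then iterate over the result. The closure passed to foreach runs on the executors, where the SparkContext behind an RDD is not available, which is the usual reason nested RDD operations fail like this. The sample data, the object name JoinOutsideForeach and the helper joined are hypothetical; only pointsMap, xCoordinatesWithDistance and leftOuterJoin come from the question.

```scala
import org.apache.spark.{SparkConf, SparkContext}
// Needed for the pair-RDD implicits on Spark versions before 1.3; harmless on later ones.
import org.apache.spark.SparkContext._

object JoinOutsideForeach {
  // Hypothetical helper: builds two small pair RDDs and joins them on the driver.
  def joined(sc: SparkContext): Array[(Int, (String, Option[Double]))] = {
    // Made-up sample data standing in for the RDDs from the question.
    val pointsMap = sc.parallelize(Seq(1 -> "a", 2 -> "b", 3 -> "c"))
    val xCoordinatesWithDistance = sc.parallelize(Seq(1 -> 0.5, 3 -> 1.5))

    // The join is issued once, from driver code, not from inside pointsMap.foreach.
    val pointsWithCoordinatesWithDistance =
      pointsMap.leftOuterJoin(xCoordinatesWithDistance)

    // Bring the (small) result back to the driver and sort by key for stable output.
    pointsWithCoordinatesWithDistance.collect().sortBy(_._1)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("join-outside-foreach").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try joined(sc).foreach(println)
    finally sc.stop()
  }
}
```

If the per-element logic really needs data from the second RDD and that RDD is small, another common option is to collect it on the driver, wrap it with sc.broadcast, and look values up inside the closure instead of joining there.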