2017-07-18 27 views
0

The following code throws a NullPointerException, even though Option(x._1.F2).isDefined && Option(x._2.F2).isDefined is there to guard against null values. How do I check a scala.math.BigDecimal for null?

case class Cols (F1: String, F2: BigDecimal, F3: Int, F4: Date, ...) 

def readTable(): Dataset[Cols] = {
    import sqlContext.sparkSession.implicits._ 

    sqlContext.read.format("jdbc").options(Map(
     "driver" -> "com.microsoft.sqlserver.jdbc.SQLServerDriver", 
     "url" -> jdbcSqlConn, 
     "dbtable" -> s"..." 
    )).load() 
     .select("F1", "F2", "F3", "F4") 
     .as[Cols] 
    } 

import org.apache.spark.sql.{functions => func} 
val j = readTable().joinWith(readTable(), func.lit(true)) 
j.filter(x =>
    (if (Option(x._1.F2).isDefined && Option(x._2.F2).isDefined
     && (x._1.F2 - x._2.F2 < 1)) 1 else 0) // line 51
    + ..... > 100)

I also tried !(x._1.F2 == null || x._2.F2 == null), and it still throws the exception.

The exception is:

 
java.lang.NullPointerException 
     at scala.math.BigDecimal.$minus(BigDecimal.scala:563) 
     at MappingPoint$$anonfun$compare$1.apply(MappingPoint.scala:51) 
     at MappingPoint$$anonfun$compare$1.apply(MappingPoint.scala:44) 
     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source) 
     at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) 
     at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395) 
     at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:234) 
     at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:228) 
     at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827) 
     at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827) 
     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) 
     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) 
     at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) 
     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) 
     at org.apache.spark.scheduler.Task.run(Task.scala:108) 
     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335) 
     at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) 
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) 
     at java.lang.Thread.run(Unknown Source) 

Update: I tried the expression below, and execution still reaches the line x._1.F2 - x._2.F2. Is this a valid way to check whether a BigDecimal is null?

(if (!(Option(x._1.F2).isDefined && Option(x._2.F2).isDefined
     && x._1.F2 != null && x._2.F2 != null)) 0
     else (if (x._1.F2 - x._2.F2 < 1) 1 else 0))

Update 2

After I wrapped the subtraction in math.abs((l.F2 - r.F2).toDouble), the exception went away. Why?

Answers

0

Try adding this to your if statement:

&& (x._1.F2 && x._2.F2) != null

I had a similar problem in Java, and this has always worked for me.

+1

That gives the compiler error 'value && is not a member of Option[BigDecimal]'. Can '&&' be applied to 'Option(...)'? – ca9163d9

+0

No, it can't. – pedrofurla

+1

I tried '(x._1.F2 == null || x._2.F2 == null)', and it still throws the exception. – ca9163d9

0

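Looking at the source code for BigDecimal, at line 563: https://github.com/scala/scala/blob/v2.11.8/src/library/scala/math/BigDecimal.scala#L563

It could be that x._1.F2.bigDecimal or x._2.F2.bigDecimal is null, though I really don't know how that could happen, given that the constructor checks for it. But maybe check for null there and see whether that solves the problem?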
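As a rough sketch of that check, assuming plain Scala with no Spark involved: the helper below tests both the Scala wrapper and the java.math.BigDecimal it wraps before subtracting. The names BigDecimalGuard, isUsable, and safeMinus are hypothetical and do not come from the question's code, and whether this removes the exception in the question is untested.

```scala
object BigDecimalGuard {
  // True only when the Scala wrapper and the java.math.BigDecimal
  // it wraps are both non-null.
  def isUsable(d: BigDecimal): Boolean =
    (d ne null) && (d.bigDecimal ne null)

  // Subtracts only when both operands pass the guard; otherwise None.
  def safeMinus(a: BigDecimal, b: BigDecimal): Option[BigDecimal] =
    if (isUsable(a) && isUsable(b)) Some(a - b) else None
}
```

A caller could then write something like BigDecimalGuard.safeMinus(x._1.F2, x._2.F2).exists(_ < BigDecimal(1)) in place of the unguarded subtraction.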

By the way, you really should avoid all the ._1 and ._2 accessors. You should be able to do something like this:

val (l: Cols, r: Cols) = x 

to extract the tuple values.
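Combining the tuple destructuring with Option might look like the sketch below. It assumes a trimmed-down, hypothetical stand-in for Cols with only two fields, and the names NullSafeDiff, diff, and score are made up for illustration. It is plain Scala rather than Spark code, so it only shows the shape of the null handling and is not a confirmed fix for the exception in the question.

```scala
// Hypothetical, trimmed-down stand-in for the question's Cols class.
final case class Cols(F1: String, F2: BigDecimal)

object NullSafeDiff {
  // Some(l.F2 - r.F2) when both values are non-null, otherwise None.
  def diff(pair: (Cols, Cols)): Option[BigDecimal] = {
    val (l, r) = pair // destructure once instead of repeating ._1 / ._2
    for {
      a <- Option(l.F2) // Option(null) is None
      b <- Option(r.F2)
    } yield a - b
  }

  // 1 when the difference exists and is below 1, otherwise 0,
  // mirroring the if/else term in the question's filter.
  def score(pair: (Cols, Cols)): Int =
    if (diff(pair).exists(_ < BigDecimal(1))) 1 else 0
}
```

Because the subtraction sits inside the for-comprehension, it is only evaluated when both Option values are defined, so no separate isDefined check is needed.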

+0

The odd thing is that I do check whether they are null, and if either one is null, that line should not be reached. – ca9163d9

+1

Guess what: after I wrapped the subtraction in 'math.abs((l.F2 - r.F2).toDouble)', the exception went away. – ca9163d9