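NullPointerException - Apache Spark Dataset left outer join
I am trying to learn Spark Datasets (Spark 2.0.1). The left outer join below throws a NullPointerException.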
case class Employee(name: String, age: Int, departmentId: Int, salary: Double)
case class Department(id: Int, depname: String)
case class Record(name: String, age: Int, salary: Double, departmentId: Int, departmentName: String)
val employeeDataSet = sc.parallelize(Seq(Employee("Jax", 22, 5, 100000.0),Employee("Max", 22, 1, 100000.0))).toDS()
val departmentDataSet = sc.parallelize(Seq(Department(1, "Engineering"), Department(2, "Marketing"))).toDS()
val averageSalaryDataset = employeeDataSet.joinWith(departmentDataSet, $"departmentId" === $"id", "left_outer")
  .map(record => Record(record._1.name, record._1.age, record._1.salary, record._1.departmentId, record._2.depname))
averageSalaryDataset.show()
16/12/14 16:48:26 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 12) java.lang.NullPointerException
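This happens because the left outer join yields null for record._2 when an employee has no matching department (here "Jax", whose departmentId is 5), so evaluating record._2.depname throws.
How can I handle this? Thanks.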
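While this might work, it is a very poor solution :O! I don't understand why the join doesn't give back an Option; a case class would then be easy to check. – Sparky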
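For what it's worth, an Option-style check can be had by wrapping the possibly-null right side in Option(...) before touching its fields. The sketch below is plain Scala with no Spark dependency: `joinedPairs` is a hand-written stand-in for the pairs a left outer joinWith would produce, and `NullSafeJoin`, `toRecord` and the "Unknown" placeholder are hypothetical names picked for illustration, not anything from the Spark API.

```scala
// Case classes as in the question.
case class Employee(name: String, age: Int, departmentId: Int, salary: Double)
case class Department(id: Int, depname: String)
case class Record(name: String, age: Int, salary: Double, departmentId: Int, departmentName: String)

object NullSafeJoin {
  // Assumption: a placeholder name for employees without a matching department.
  val UnknownDepartment = "Unknown"

  // Option(x) is None when x is null, so a missing right side is never dereferenced.
  // flatMap also covers the case where the department exists but depname is null.
  def toRecord(pair: (Employee, Department)): Record = {
    val (employee, department) = pair
    val departmentName = Option(department)
      .flatMap(d => Option(d.depname))
      .getOrElse(UnknownDepartment)
    Record(employee.name, employee.age, employee.salary, employee.departmentId, departmentName)
  }

  // Stand-in for the joined rows: "Jax" has no department, so the right side is null.
  val joinedPairs: Seq[(Employee, Department)] = Seq(
    (Employee("Jax", 22, 5, 100000.0), null),
    (Employee("Max", 22, 1, 100000.0), Department(1, "Engineering"))
  )

  val records: Seq[Record] = joinedPairs.map(toRecord)
}
```

Assuming the Dataset returned by joinWith hands the same kind of (Employee, Department) pairs to map, the same function could presumably be passed there in place of the inline lambda, e.g. `.map(NullSafeJoin.toRecord _)`. Changing departmentName to Option[String] in Record would be an alternative to the placeholder string.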