How can I create a Dataset from a custom class Person?
I am trying to create a Dataset in Java, so I wrote the following code:
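import java.util.ArrayList;
import java.util.List;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;

public Dataset<Person> createDataset() {
    List<Person> list = new ArrayList<>();
    list.add(new Person("name", 10, 10.0));
    // Encoders.bean derives the schema from the Person bean's getters and setters
    Dataset<Person> dataset = sqlContext.createDataset(list, Encoders.bean(Person.class));
    return dataset;
}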
The Person class is an inner class. However, Spark throws the following exception:
org.apache.spark.sql.AnalysisException: Unable to generate an encoder for inner class `....` without access to the scope that this class was defined in. Try moving this class out of its parent class.;
at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$$anonfun$2.applyOrElse(ExpressionEncoder.scala:264)
at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$$anonfun$2.applyOrElse(ExpressionEncoder.scala:260)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:243)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:243)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:53)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:242)
How do I do this correctly?
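For context, here is a minimal sketch of what the exception message seems to be pointing at. It uses only plain Java reflection, no Spark, and assumes the error comes from Person being a non-static inner class: such a class cannot be constructed without an instance of its enclosing class, which may be what "without access to the scope" refers to. The class name EncoderScopeSketch, the helper hasNoArgConstructor, and the field names name, age and score are all hypothetical, since the original Person class is not shown.

```java
import java.io.Serializable;
import java.lang.reflect.Constructor;

public class EncoderScopeSketch {

    // Non-static inner class: every instance is bound to an EncoderScopeSketch instance,
    // so its constructors implicitly take the enclosing instance as their first parameter.
    public class InnerPerson {
        private String name;
        public InnerPerson() {}
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    // Static nested class written as a JavaBean (hypothetical fields):
    // public no-arg constructor plus getters and setters, no enclosing instance needed.
    public static class Person implements Serializable {
        private String name;
        private int age;
        private double score;

        public Person() {}

        public Person(String name, int age, double score) {
            this.name = name;
            this.age = age;
            this.score = score;
        }

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public int getAge() { return age; }
        public void setAge(int age) { this.age = age; }
        public double getScore() { return score; }
        public void setScore(double score) { this.score = score; }
    }

    // Returns true if the class declares a constructor that takes no parameters at all,
    // counting the implicit enclosing-instance parameter of inner classes.
    public static boolean hasNoArgConstructor(Class<?> cls) {
        for (Constructor<?> c : cls.getDeclaredConstructors()) {
            if (c.getParameterCount() == 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The inner class only has a constructor that needs the outer instance.
        System.out.println(hasNoArgConstructor(InnerPerson.class)); // prints false
        // The static nested class can be created on its own.
        System.out.println(hasNoArgConstructor(Person.class));      // prints true
    }
}
```

If that assumption holds, declaring Person as a static nested class, or moving it into its own file as the message suggests, should let `Encoders.bean(Person.class)` build the encoder. I have not verified this against every Spark version.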
Using [Spark Notebook](http://spark-notebook.io) with scala 0.11, adding `org.apache.spark.sql.catalyst.encoders.OuterScopes.addOuterScope(this)` after the case class definition and before using it in a DataFrame command did indeed solve the problem. –
I was asking about the addOuterScope method, in case you know why it has to be added for the encoder to work properly – eliasah
Thanks a lot for the update. I asked you because I had been working on this issue earlier at http://stackoverflow.com/a/40232936/3415409 – eliasah