來自IBM藍色裝入混合對象存儲一個數據幀的計數()拋出時則InferSchema啓用了以下情況例外:計數拋出java.lang.NumberFormatException:空上從對象存儲器裝載有則InferSchema文件啓用
Name: org.apache.spark.SparkException
Message: Job aborted due to stage failure: Task 3 in stage 43.0 failed 10 times, most recent failure: Lost task 3.9 in stage 43.0 (TID 166, yp-spark-dal09-env5-0034): java.lang.NumberFormatException: null
at java.lang.Integer.parseInt(Integer.java:554)
at java.lang.Integer.parseInt(Integer.java:627)
at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:272)
at scala.collection.immutable.StringOps.toInt(StringOps.scala:29)
at org.apache.spark.sql.execution.datasources.csv.CSVTypeCast$.castTo(CSVInferSchema.scala:241)
at org.apache.spark.sql.execution.datasources.csv.CSVRelation$$anonfun$csvParser$3.apply(CSVRelation.scala:116)
at org.apache.spark.sql.execution.datasources.csv.CSVRelation$$anonfun$csvParser$3.apply(CSVRelation.scala:85)
at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$1$$anonfun$apply$2.apply(CSVFileFormat.scala:128)
at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$1$$anonfun$apply$2.apply(CSVFileFormat.scala:127)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:91)
如果禁用inferSchema,我不會收到上述異常。 爲什麼我得到這個異常?默認情況下,如果啓用了「inferSchema」,則數據庫將讀取多少行?
您使用的是什麼版本的spark csv? – eliasah
火花csv版本是1.5 – Garipaso
和火花? – eliasah