2017-07-26

I'm trying to run a basic Scala Spark example that reads a JSON file, and Spark throws an error. When attempting to read the JSON file from my local FS, I get the following:

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.fs.FileStatus.isDirectory()Z 
at org.apache.spark.sql.execution.datasources.ListingFileCatalog$$anonfun$1$$anonfun$apply$2.apply(ListingFileCatalog.scala:129) 
at org.apache.spark.sql.execution.datasources.ListingFileCatalog$$anonfun$1$$anonfun$apply$2.apply(ListingFileCatalog.scala:116) 
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) 
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) 
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) 
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108) 
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244) 
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108) 
at org.apache.spark.sql.execution.datasources.ListingFileCatalog$$anonfun$1.apply(ListingFileCatalog.scala:116) 
at org.apache.spark.sql.execution.datasources.ListingFileCatalog$$anonfun$1.apply(ListingFileCatalog.scala:102) 
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251) 
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251) 
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) 
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34) 
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251) 
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105) 
at org.apache.spark.sql.execution.datasources.ListingFileCatalog.listLeafFiles(ListingFileCatalog.scala:102) 
at org.apache.spark.sql.execution.datasources.ListingFileCatalog.refresh(ListingFileCatalog.scala:75) 
at org.apache.spark.sql.execution.datasources.ListingFileCatalog.<init>(ListingFileCatalog.scala:56) 
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:379) 
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149) 
at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:287) 
at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:249) 
at LoadJsonWithSparkSQL$.main(LoadJsonWithSparkSQL.scala:50) 
at LoadJsonWithSparkSQL.main(LoadJsonWithSparkSQL.scala) 
17/07/26 17:13:37 INFO spark.SparkContext: Invoking stop() from shutdown hook 

Any ideas how to fix this issue?

My setup is:

Spark: 2.0.0

Scala: 2.10

All files are on my local FS.

Did you prefix the path with file://? – dumitru

Yes. I tried with and without "file:///" –

What error do you get if you put 'file://' in front? – jamborta

Answer


Here we can take two options: sc.textFile("file:///path/file/") if it is a text file.
Otherwise, if it is a JSON file, then you can try it with a DataFrame: df = sqlContext.read.json("file").
Please try creating a DataFrame; with a DataFrame it is easy to explore the data.
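The two options above can be sketched as follows. This is a minimal local-mode sketch assuming Spark 2.0's SparkSession API; the file paths and the object name are placeholders, not the asker's actual code.

```scala
import org.apache.spark.sql.SparkSession

object LoadJsonExample {
  def main(args: Array[String]): Unit = {
    // Local-mode session for experimenting on the local FS
    val spark = SparkSession.builder()
      .appName("LoadJsonExample")
      .master("local[*]")
      .getOrCreate()

    // Option 1: read a plain text file as an RDD[String]
    val lines = spark.sparkContext.textFile("file:///path/to/data.txt")
    println(s"line count: ${lines.count()}")

    // Option 2: read a JSON file as a DataFrame (schema is inferred)
    val df = spark.read.json("file:///path/to/data.json")
    df.printSchema()
    df.show()

    spark.stop()
  }
}
```

In Spark 2.0, spark.read.json is the same reader the answer reaches through sqlContext.read.json; the SparkSession form is just the newer entry point.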

I tried val df = sparkSession.sqlContext.read.json(inputFile). Still the same error. –
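For what it's worth, a NoSuchMethodError on org.apache.hadoop.fs.FileStatus.isDirectory() typically indicates a Hadoop version mismatch on the classpath rather than a problem with the read call itself: isDirectory() exists in Hadoop 2.x, while Hadoop 1.x only had isDir(), so an old Hadoop 1.x jar shadowing the one Spark 2.0 was built against would produce exactly this error. A hedged build.sbt sketch pinning consistent versions (the Hadoop version number here is an assumption, not from the source):

```scala
// build.sbt — hypothetical sketch for a Spark 2.0.0 project
scalaVersion := "2.11.8" // Spark 2.0.0 is built against Scala 2.11 by default

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql" % "2.0.0",
  // Ensure no Hadoop 1.x jar ends up on the classpath;
  // the concrete 2.x version is an assumption for illustration
  "org.apache.hadoop" % "hadoop-client" % "2.7.2"
)
```

Checking `spark-submit --version` and inspecting the classpath for stray hadoop-core 1.x jars would be the first diagnostic step under this hypothesis.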