I am running EMR with a Spark cluster on AWS. The Spark version is 1.6, and I want to add the Spark CSV dependency to Zeppelin.
When I run the following command:
proxy = sqlContext.read.load("/user/zeppelin/ProxyRaw.csv",
format="com.databricks.spark.csv",
header="true",
inferSchema="true")
I get the following error:
Py4JJavaError: An error occurred while calling o162.load. : java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.csv. Please find packages at http://spark-packages.org at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
How can I fix this? I assume I need to add a package, but how do I install it, and where?
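For reference, a minimal sketch of one way the package could be pulled in from inside a Zeppelin note, using the %dep dependency interpreter (it has to be evaluated before the Spark interpreter starts, so restart the interpreter first if needed; the com.databricks:spark-csv_2.10:1.5.0 coordinates are an assumption and should be matched to your Scala/Spark build):

%dep
// reset previously loaded dependencies, then add spark-csv from Maven Central
z.reset()
// artifact version is a guess; use the spark-csv build that matches your Scala version
z.load("com.databricks:spark-csv_2.10:1.5.0")

After that, the original sqlContext.read.load(..., format="com.databricks.spark.csv", ...) call should be able to resolve the data source. An alternative, again as an assumption about your setup, is to add --packages com.databricks:spark-csv_2.10:1.5.0 to SPARK_SUBMIT_OPTIONS in zeppelin-env.sh and restart Zeppelin.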
Please accept an answer to close this question! – eliasah