I want to read from and write to Cassandra with Spark, using the dependencies below. Why does loading a Dataset from Cassandra fail with a NullPointerException?
"com.datastax.spark" % "spark-cassandra-connector-unshaded_2.11" % "2.0.0-M3",
"com.datastax.cassandra" % "cassandra-driver-core" % "3.0.0"
Here is the code:
import com.datastax.spark.connector._
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession
val sparkConf: SparkConf = new SparkConf().setAppName(appName)
.set("spark.cassandra.connection.host", hostname)
.set("spark.cassandra.auth.username", user)
.set("spark.cassandra.auth.password", password)
val spark = SparkSession.builder().config(sparkConf).getOrCreate()
val df = spark
.read
.format("org.apache.spark.sql.cassandra")
.options(Map("table" -> s"$TABLE", "keyspace" -> s"$KEYSPACE"))
.load() // This Dataset will use a spark.cassandra.input.size of 128
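For completeness, the write I eventually want to do would look roughly like this (a sketch only, assuming the connector's DataFrame sink; WRITE_TABLE and KEYSPACE are placeholders, and the failure below already happens on the read):

```scala
import org.apache.spark.sql.SaveMode

// Write the DataFrame back to Cassandra through the same data source.
// WRITE_TABLE / KEYSPACE are placeholder names, not real ones.
df.write
  .format("org.apache.spark.sql.cassandra")
  .options(Map("table" -> s"$WRITE_TABLE", "keyspace" -> s"$KEYSPACE"))
  .mode(SaveMode.Append)
  .save()
```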
But when I try to spark-submit, I get the following:
Exception in thread "main" java.lang.NullPointerException
at com.datastax.driver.core.Cluster$Manager.close(Cluster.java:1516)
at com.datastax.driver.core.Cluster$Manager.access$200(Cluster.java:1237)
at com.datastax.driver.core.Cluster.closeAsync(Cluster.java:540)
at com.datastax.driver.core.Cluster.close(Cluster.java:551)
at com.datastax.spark.connector.cql.CassandraConnector$.com$datastax$spark$connector$cql$CassandraConnector$$createSession(CassandraConnector.scala:162)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$3.apply(CassandraConnector.scala:149)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$3.apply(CassandraConnector.scala:149)
at com.datastax.spark.connector.cql.RefCountedCache.createNewValueAndKeys(RefCountedCache.scala:31)
at com.datastax.spark.connector.cql.RefCountedCache.acquire(RefCountedCache.scala:56)
at com.datastax.spark.connector.cql.CassandraConnector.openSession(CassandraConnector.scala:82)
at com.datastax.spark.connector.cql.CassandraConnector.withSessionDo(CassandraConnector.scala:110)
at com.datastax.spark.connector.rdd.partitioner.dht.TokenFactory$.forSystemLocalPartitioner(TokenFactory.scala:98)
at org.apache.spark.sql.cassandra.CassandraSourceRelation$.apply(CassandraSourceRelation.scala:255)
at org.apache.spark.sql.cassandra.DefaultSource.createRelation(DefaultSource.scala:55)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:345)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:122)
Thanks for the explanation! The link helped. I removed the driver dependency. I'm now using this (besides spark-core etc.):
sbt: `"com.datastax.spark" %% "spark-cassandra-connector" % "2.0.2" % "provided",`
code: `val df = spark.read.format("org.apache.spark.sql.cassandra").options(Map("table" -> s"$TABLE", "keyspace" -> s"$KEYSPACE")).load()`
and my jar is a fat jar with all the dependencies. I'm running it with spark-submit, but I see the same NPE again. –
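For reference, my fat jar is built with sbt-assembly, roughly like this (a sketch; the merge strategy shown is an assumption about my setup, not something claimed to fix the NPE):

```scala
// build.sbt fragment (sbt-assembly plugin assumed to be enabled)
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard // drop signature files
  case _                             => MergeStrategy.first   // otherwise keep first copy
}
```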
The examples in SparkBuildExamples say: "Note that the spark-cassandra-connector should be provided to the spark-submit command with the '--packages' flag." Do I need this even for spark-submit? I thought it was only required for spark-shell. Could this cause the error? If so, what should my spark-submit look like? –
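Concretely, the invocation I'm unsure about would look something like this (the class name and jar path are placeholders; the `--packages` coordinate is assumed to match the sbt line above):

```shell
spark-submit \
  --class com.example.MyApp \
  --packages com.datastax.spark:spark-cassandra-connector_2.11:2.0.2 \
  target/scala-2.11/myapp-assembly.jar
```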
I took the example from the SparkBuildExamples link and am trying WriteRead. The original NPE is gone; now I get: `Exception in thread "main" java.lang.NoSuchMethodError: com.datastax.spark.connector.cql.CassandraConnector$.apply(Lorg/apache/spark/SparkContext;)Lcom/datastax/spark/connector/cql/CassandraConnector;` Any suggestions? I'm using the spark-core, spark-sql, and Cassandra connector jars as described above (without hive — does that matter?) –
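To see which connector classes actually end up on the classpath, I also inspected the fat jar like this (the jar path is a placeholder for my own build output):

```shell
# List the DataStax classes bundled into the assembly jar
jar tf target/scala-2.11/myapp-assembly.jar | grep -i "datastax"
```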