
Error in Spark–Cassandra integration with the spark-cassandra-connector

I am trying to save data to Cassandra from Spark in standalone mode, by running the following command:

bin/spark-submit --packages datastax:spark-cassandra-connector:1.6.0-s_2.10 \
    --class "pl.japila.spark.SparkMeApp" --master local \
    /home/hduser2/code14/target/scala-2.10/simple-project_2.10-1.0.jar

My build.sbt file is:

name := "Simple Project"
version := "1.0"
scalaVersion := "2.10.4"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.6.0"

resolvers += "Spark Packages Repo" at "https://dl.bintray.com/spark-packages/maven"
libraryDependencies += "datastax" % "spark-cassandra-connector" % "1.6.0-s_2.10"

libraryDependencies ++= Seq(
    "org.apache.cassandra" % "cassandra-thrift" % "3.5",
    "org.apache.cassandra" % "cassandra-clientutil" % "3.5",
    "com.datastax.cassandra" % "cassandra-driver-core" % "3.0.0"
)

My Spark code is:

package pl.japila.spark

import org.apache.spark.{SparkContext, SparkConf}
import org.apache.spark.sql._
import com.datastax.spark.connector._
import com.datastax.spark.connector.cql._
import com.datastax.spark.connector.rdd._
import com.datastax.driver.core._

object SparkMeApp {
  def main(args: Array[String]) {
    // Point the connector at the local Cassandra node
    val conf = new SparkConf(true).set("spark.cassandra.connection.host", "127.0.0.1")
    val sc = new SparkContext("local", "test", conf)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)

    // Read the existing test.kv table, then write two rows back to it
    val rdd = sc.cassandraTable("test", "kv")
    val collection = sc.parallelize(Seq(("cat", 30), ("fox", 40)))
    collection.saveToCassandra("test", "kv", SomeColumns("key", "value"))
  }
}
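
(For context: saveToCassandra writes to an existing table, so the test keyspace and kv table must already exist. The original post does not show the schema, but a minimal one matching this code would be something like:

cqlsh> CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
cqlsh> CREATE TABLE test.kv (key text PRIMARY KEY, value int);
)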

And I got this error:

Exception in thread "main" java.lang.NoSuchMethodError: com.datastax.driver.core.QueryOptions.setRefreshNodeIntervalMillis(I)Lcom/datastax/driver/core/QueryOptions;
    at com.datastax.spark.connector.cql.DefaultConnectionFactory$.clusterBuilder(CassandraConnectionFactory.scala:49)
    at com.datastax.spark.connector.cql.DefaultConnectionFactory$.createCluster(CassandraConnectionFactory.scala:92)
    at com.datastax.spark.connector.cql.CassandraConnector$.com$datastax$spark$connector$cql$CassandraConnector$$createSession(CassandraConnector.scala:153)
    at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$3.apply(CassandraConnector.scala:148)
    at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$3.apply(CassandraConnector.scala:148)
    at com.datastax.spark.connector.cql.RefCountedCache.createNewValueAndKeys(RefCountedCache.scala:31)
    at com.datastax.spark.connector.cql.RefCountedCache.acquire(RefCountedCache.scala:56)
    at com.datastax.spark.connector.cql.CassandraConnector.openSession(CassandraConnector.scala:81)
    at com.datastax.spark.connector.cql.CassandraConnector.withSessionDo(CassandraConnector.scala:109)

The versions used are:
Spark - 1.6.0
Scala - 2.10.4
cassandra-driver-core jar - 3.0.0
Cassandra - 2.2.7
spark-cassandra-connector - 1.6.0-s_2.10

Someone please help!

Answers


I would start by removing:

libraryDependencies ++= Seq(  
    "org.apache.cassandra" % "cassandra-thrift" % "3.5" ,  
    "org.apache.cassandra" % "cassandra-clientutil" % "3.5",  
    "com.datastax.cassandra" % "cassandra-driver-core" % "3.0.0"  
) 

since those libraries are dependencies of the connector and will be included automatically with the package dependency.
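
For reference, the trimmed build.sbt from the question would then be just:

name := "Simple Project"
version := "1.0"
scalaVersion := "2.10.4"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.6.0"

resolvers += "Spark Packages Repo" at "https://dl.bintray.com/spark-packages/maven"
libraryDependencies += "datastax" % "spark-cassandra-connector" % "1.6.0-s_2.10"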

Then I would test the resolution of the package by starting the spark-shell with:

./bin/spark-shell --packages datastax:spark-cassandra-connector:1.6.0-s_2.10 

and check that you see the resolution happening correctly, like this:

datastax#spark-cassandra-connector added as a dependency 
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0 
     confs: [default] 
     found datastax#spark-cassandra-connector;1.6.0-s_2.10 in spark-packages 
     found org.apache.cassandra#cassandra-clientutil;3.0.2 in list 
     found com.datastax.cassandra#cassandra-driver-core;3.0.0 in list 
     ... 
     [2.10.5] org.scala-lang#scala-reflect;2.10.5 
:: resolution report :: resolve 627ms :: artifacts dl 10ms 
     :: modules in use: 
     com.datastax.cassandra#cassandra-driver-core;3.0.0 from list in [default] 
     com.google.guava#guava;16.0.1 from list in [default] 
     com.twitter#jsr166e;1.1.0 from list in [default] 
     datastax#spark-cassandra-connector;1.6.0-s_2.10 from spark-packages in [default] 
     ... 

If those seem to resolve correctly but everything still doesn't work, I would try clearing the cache for these artifacts.
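
With --packages, spark-submit resolves artifacts through Ivy, which caches them under ~/.ivy2 by default. A sketch of clearing the relevant entries (the exact directory names are an assumption based on the default Ivy layout):

# Remove the cached connector and driver artifacts (default Ivy cache location assumed)
rm -rf ~/.ivy2/cache/datastax
rm -rf ~/.ivy2/cache/com.datastax.cassandra
# Jars resolved for earlier spark-submit/spark-shell runs
rm -rf ~/.ivy2/jars

After clearing the cache, rerunning the spark-shell command above should re-download the packages fresh.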


Thanks a lot Russ, I solved my problem. It was a cache issue. –
