
I am new to Apache Spark and I want to connect Spark to a Cassandra database, but I am having a problem reading data with the Spark Cassandra Connector through the Spark Java API.

  • Spark version: 2.2.0
  • Cassandra version: 2.1.14

The error occurs at the line below:

long count = javaFunctions(sc).cassandraTable("test", "table1").count();

My Java code is as follows:

// Requires: import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;
public SparkTestPanel(String id, User user) {
    super(id);
    form = new Form("form");
    form.setOutputMarkupId(true);
    this.add(form);
    SparkConf conf = new SparkConf(true);
    conf.setAppName("Spark Test");
    conf.setMaster("local[*]");
    conf.set("spark.cassandra.connection.host", "127.0.0.1");
    conf.set("spark.cassandra.auth.username", "username");
    conf.set("spark.cassandra.auth.password", "password");
    JavaSparkContext sc = null;
    try {
        sc = new JavaSparkContext(conf);
        long count = javaFunctions(sc).cassandraTable("test", "table1").count();
    } finally {
        if (sc != null) { // avoid an NPE here if context creation itself failed
            sc.stop();
        }
    }
}

And my pom.xml is as follows:

<!-- https://mvnrepository.com/artifact/com.datastax.spark/spark-cassandra-connector-java_2.10 -->
<dependency>
    <groupId>com.datastax.spark</groupId>
    <artifactId>spark-cassandra-connector-java_2.11</artifactId>
    <version>1.5.2</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11 -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.2.0</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.2.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-databind -->
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
    <version>2.6.5</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming_2.11</artifactId>
    <version>2.2.0</version>
</dependency>
<dependency>
    <groupId>com.datastax.spark</groupId>
    <artifactId>spark-cassandra-connector_2.11</artifactId>
    <version>2.0.3</version>
</dependency>
<dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-core</artifactId>
    <version>3.2.0</version>
    <type>jar</type>
</dependency>

And I get this error:

[Stage 0:=======================>         (4 + 4)/10] 
[Stage 0:>               (0 + 4)/10] 
[Stage 0:=====>             (1 + 4)/10] 
[Stage 0:=================>          (3 + 4)/10] 
2017-08-16 12:22:48,857 ERROR Executor:91 - Exception in task 6.0 in stage 0.0 (TID 6) 
java.lang.AbstractMethodError: com.datastax.spark.connector.japi.GenericJavaRowReaderFactory$JavaRowReader.read(Lcom/datastax/driver/core/Row;Lcom/datastax/spark/connector/CassandraRowMetadata;)Ljava/lang/Object; 
at com.datastax.spark.connector.rdd.CassandraTableScanRDD$$anonfun$15.apply(CassandraTableScanRDD.scala:345) 
at com.datastax.spark.connector.rdd.CassandraTableScanRDD$$anonfun$15.apply(CassandraTableScanRDD.scala:345) 
at scala.collection.Iterator$$anon$11.next(Iterator.scala:370) 
at scala.collection.Iterator$$anon$12.next(Iterator.scala:397) 
at com.datastax.spark.connector.util.CountingIterator.next(CountingIterator.scala:16) 
at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1801) 
at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1158) 
at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1158) 
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2062) 
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2062) 
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) 
at org.apache.spark.scheduler.Task.run(Task.scala:108) 
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335) 
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
at java.lang.Thread.run(Thread.java:748) 
2017-08-16 12:22:48,861 ERROR TaskSetManager:70 - Task 6 in stage 0.0 failed 1 times; aborting job 

Can anyone help?


The versions mentioned require Java 8... check your Java version –


My Java version is java version "1.8.0_131". @undefined_variable – MohamadrezaRostami

Answer


Remove the spark-cassandra-connector-java_2.11 and cassandra-driver-core dependencies. As of connector 2.0.x, the Java API lives in the main spark-cassandra-connector artifact, which also pulls in a compatible cassandra-driver-core; mixing in the old 1.5.2 Java module is what causes the AbstractMethodError. Your pom.xml should look like the following:

<dependency> 
    <groupId>org.apache.spark</groupId> 
    <artifactId>spark-core_2.11</artifactId> 
    <version>2.2.0</version> 
</dependency> 
<dependency> 
    <groupId>org.apache.spark</groupId> 
    <artifactId>spark-sql_2.11</artifactId> 
    <version>2.2.0</version> 
</dependency> 
<dependency> 
    <groupId>org.apache.spark</groupId> 
    <artifactId>spark-streaming_2.11</artifactId> 
    <version>2.2.0</version> 
</dependency> 
<dependency> 
    <groupId>com.datastax.spark</groupId> 
    <artifactId>spark-cassandra-connector_2.11</artifactId> 
    <version>2.0.3</version> 
</dependency> 
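
For reference, here is a minimal self-contained sketch of the same read once only the dependencies above are on the classpath. It assumes the setup from the question (a local Cassandra node at 127.0.0.1, the same placeholder credentials, and an existing test.table1 table); the class name CassandraCountExample is made up for illustration:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

// javaFunctions(...) comes from the main connector artifact in 2.0.x
import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;

public class CassandraCountExample {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf(true)
                .setAppName("Spark Test")
                .setMaster("local[*]")
                .set("spark.cassandra.connection.host", "127.0.0.1")
                .set("spark.cassandra.auth.username", "username")   // placeholder credentials from the question
                .set("spark.cassandra.auth.password", "password");

        JavaSparkContext sc = new JavaSparkContext(conf);
        try {
            // With a single connector version on the classpath this no longer
            // throws the AbstractMethodError from the question
            long count = javaFunctions(sc).cassandraTable("test", "table1").count();
            System.out.println("row count = " + count);
        } finally {
            sc.stop();
        }
    }
}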