4

Zeppelin Spark RDD commands fail yet spark-shell works

I have set up a standalone single-node "cluster" running the following:

  • Cassandra 2.2.2
  • Spark 1.5.1
  • Compiled fat jar for spark-cassandra-connector 1.5.0-M2
  • Compiled Zeppelin 0.6 snapshot, built with: mvn -Pspark-1.5 -Dspark.version=1.5.1 -Dhadoop.version=2.6.0 -Phadoop-2.4 -DskipTests clean package

I can work with Cassandra from spark-shell and retrieve data perfectly fine. I have changed zeppelin-env.sh as follows:

export MASTER=spark://localhost:7077 
export SPARK_HOME=/root/spark-1.5.1-bin-hadoop2.6/ 
export ZEPPELIN_PORT=8880 
export ZEPPELIN_JAVA_OPTS="-Dspark.jars=/opt/sparkconnector/spark-cassandra-connector-assembly-1.5.0-M2-SNAPSHOT.jar -Dspark.cassandra.connection.host=localhost" 
export ZEPPELIN_NOTEBOOK_DIR="/root/gowalla-spark-demo/notebooks/zeppelin" 
export SPARK_SUBMIT_OPTIONS="--jars /opt/sparkconnector/spark-cassandra-connector-assembly-1.5.0-M2-SNAPSHOT.jar --deploy-mode cluster" 
export ZEPPELIN_INTP_JAVA_OPTS=$ZEPPELIN_JAVA_OPTS 

I then start adding paragraphs to the notebook, entering the following first:

import com.datastax.spark.connector._ 
import com.datastax.spark.connector.cql._ 
import com.datastax.spark.connector.rdd.CassandraRDD 
import org.apache.spark.rdd.RDD 
import org.apache.spark.SparkContext 
import org.apache.spark.SparkConf 

Not sure whether all of these are required. This paragraph runs fine.

Then I do the following:

val checkins = sc.cassandraTable("lbsn", "checkins") 

This runs fine and returns:

checkins: com.datastax.spark.connector.rdd.CassandraTableScanRDD[com.datastax.spark.connector.CassandraRow] = CassandraTableScanRDD[0] at RDD at CassandraRDD.scala:15 

Then, in the next paragraph, the following 2 statements are run; the first succeeds and the second fails:

checkins.count 
checkins.first 

Result:

res13: Long = 138449 
com.fasterxml.jackson.databind.JsonMappingException: Could not find creator property with name 'id' (in class org.apache.spark.rdd.RDDOperationScope) 
at [Source: {"id":"4","name":"first"}; line: 1, column: 1] 
at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:148) 
at com.fasterxml.jackson.databind.DeserializationContext.mappingException(DeserializationContext.java:843) 
at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.addBeanProps(BeanDeserializerFactory.java:533) 
at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.buildBeanDeserializer(BeanDeserializerFactory.java:220) 
at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.createBeanDeserializer(BeanDeserializerFactory.java:143) 
at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer2(DeserializerCache.java:409) 
at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer(DeserializerCache.java:358) 
at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCache2(DeserializerCache.java:265) 
at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCacheValueDeserializer(DeserializerCache.java:245) 
at com.fasterxml.jackson.databind.deser.DeserializerCache.findValueDeserializer(DeserializerCache.java:143) 
at com.fasterxml.jackson.databind.DeserializationContext.findRootValueDeserializer(DeserializationContext.java:439) 
at com.fasterxml.jackson.databind.ObjectMapper._findRootDeserializer(ObjectMapper.java:3666) 
at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3558) 
at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2578) 
at org.apache.spark.rdd.RDDOperationScope$.fromJson(RDDOperationScope.scala:82) 
at org.apache.spark.rdd.RDD$$anonfun$34.apply(RDD.scala:1582) 
at org.apache.spark.rdd.RDD$$anonfun$34.apply(RDD.scala:1582) 
at scala.Option.map(Option.scala:145) 
at org.apache.spark.rdd.RDD.<init>(RDD.scala:1582) 
at com.datastax.spark.connector.rdd.CassandraRDD.<init>(CassandraRDD.scala:15) 
at com.datastax.spark.connector.rdd.CassandraTableScanRDD.<init>(CassandraTableScanRDD.scala:59) 
at com.datastax.spark.connector.rdd.CassandraTableScanRDD.copy(CassandraTableScanRDD.scala:92) 
at com.datastax.spark.connector.rdd.CassandraTableScanRDD.copy(CassandraTableScanRDD.scala:59) 
at com.datastax.spark.connector.rdd.CassandraRDD.limit(CassandraRDD.scala:103) 
at com.datastax.spark.connector.rdd.CassandraRDD.take(CassandraRDD.scala:122) 
at org.apache.spark.rdd.RDD$$anonfun$first$1.apply(RDD.scala:1312) 
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147) 
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108) 
at org.apache.spark.rdd.RDD.withScope(RDD.scala:306) 
at org.apache.spark.rdd.RDD.first(RDD.scala:1311) 
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:36) 
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:41) 
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:43) 
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:45) 
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:47) 
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:49) 
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:51) 
at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:53) 
at $iwC$$iwC$$iwC$$iwC.<init>(<console>:55) 
at $iwC$$iwC$$iwC.<init>(<console>:57) 
at $iwC$$iwC.<init>(<console>:59) 
at $iwC.<init>(<console>:61) 
at <init>(<console>:63) 
at .<init>(<console>:67) 
at .<clinit>(<console>) 
at .<init>(<console>:7) 
at .<clinit>(<console>) 
at $print(<console>) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
at java.lang.reflect.Method.invoke(Method.java:497) 
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065) 
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1340) 
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840) 
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871) 
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819) 
at org.apache.zeppelin.spark.SparkInterpreter.interpretInput(SparkInterpreter.java:655) 
at org.apache.zeppelin.spark.SparkInterpreter.interpret(SparkInterpreter.java:620) 
at org.apache.zeppelin.spark.SparkInterpreter.interpret(SparkInterpreter.java:613) 
at org.apache.zeppelin.interpreter.ClassloaderInterpreter.interpret(ClassloaderInterpreter.java:57) 
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93) 
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276) 
at org.apache.zeppelin.scheduler.Job.run(Job.java:170) 
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118) 
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) 
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) 
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
at java.lang.Thread.run(Thread.java:745) 

Why does the call to first fail? Calls such as sc.fromTextFile also fail.

The following also works:

checkins.where("year = 2010 and month=2 and day>12 and day<15").count() 

But this does not:

checkins.where("year = 2010 and month=2 and day>12 and day<15").first() 

Please help, as this is driving me crazy, especially since spark-shell works but this does not, or at least seems partially broken.

Thanks

Answer

0

com.fasterxml.jackson.databind.JsonMappingException: Could not find creator property with name 'id' (in class org.apache.spark.rdd.RDDOperationScope) at [Source: {"id":"4","name":"first"}; line: 1, column: 1]

This exception occurs when two or more versions of the Jackson library are on the classpath.

Make sure your Spark interpreter process has only one version of the Jackson library on its classpath.
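One way to check which jar each of the conflicting classes is actually loaded from is to run a small diagnostic in a Zeppelin paragraph. This is a minimal sketch, not part of the original answer; the class names are taken from the stack trace above, and a class may report "not found or no code source" when it is absent or loaded by the bootstrap classloader:

```scala
import scala.util.Try

// Print the location (jar path) a class was loaded from, if determinable.
def jarOf(className: String): String =
  Try(Class.forName(className)
        .getProtectionDomain.getCodeSource.getLocation.toString)
    .getOrElse(s"$className: not found or no code source")

// Classes from the stack trace: the Jackson entry point and the Spark
// class whose JSON deserialization fails.
println(jarOf("com.fasterxml.jackson.databind.ObjectMapper"))
println(jarOf("org.apache.spark.rdd.RDDOperationScope"))
```

If the first line points into the connector assembly jar rather than Spark's own lib directory, the fat jar is the likely source of the duplicate.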

+0

Here is an example of this issue occurring when including aws-java-sdk; the solution in that case would be similar: https://stackoverflow.com/questions/45511804/zeppelin-spark-read-parquet-from-s3-throws-nosuchmethoderror-com-fasterxm/45842852#45842852 – Greg
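If the duplicate Jackson really does come in via the fat connector jar, one common remedy (assuming the assembly is built with the sbt-assembly plugin; the question does not show the build definition, so this is only a sketch) is to shade Jackson inside the fat jar so it cannot clash with Spark's own copy:

```scala
// build.sbt fragment (sbt-assembly assumed; the rename target is illustrative)
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.fasterxml.jackson.**" -> "shadedjackson.@1").inAll
)
```

After rebuilding the assembly, the connector uses its own relocated Jackson classes while Spark's RDDOperationScope deserializes with the single unshaded version on the interpreter classpath.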