From the official docs (connecting Elasticsearch 2.4.4 to Spark 2.x), we can see:
elasticsearch-hadoop allows Elasticsearch to be used in Spark in two ways: through the dedicated support available since 2.1 or through the Map/Reduce bridge since 2.0
But when I try to use the dedicated support, as below:
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.elasticsearch.spark.rdd.api.java.JavaEsSpark;
SparkConf conf = new SparkConf().setAppName("MyApp")
.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
.set("es.nodes", "localhost")
.set("es.port", "9200")
.set("es.resource", "test/main")
.set("es.index.auto.create", "true");
JavaSparkContext sc = new JavaSparkContext(conf);
JavaRDD<String> input = sc.textFile("file:///home/zht/PycharmProjects/test/text_file.txt");
JavaRDD<Map<String, String>> formattedRdd = input.map(...);
JavaEsSpark.saveToEs(formattedRdd, "test/spark");
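For completeness, the mapper elided as `input.map(...)` above just has to turn each text line into a `Map<String, String>` document for `saveToEs`. A minimal sketch (the field name `"text"` is an assumption for illustration; the real mapping depends on the input data):

```java
import java.util.Collections;
import java.util.Map;

public class LineMapper {
    // Hypothetical mapper body: wrap each input line as a one-field document.
    // In the Spark job this would be passed as input.map(LineMapper::toDoc).
    static Map<String, String> toDoc(String line) {
        return Collections.singletonMap("text", line);
    }

    public static void main(String[] args) {
        Map<String, String> doc = toDoc("hello world");
        System.out.println(doc);
    }
}
```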
and run the command line:
spark-submit --conf spark.es.resource=test/main --jars $SPARK_HOME/jars/elasticsearch-hadoop-2.4.4.jar --class org.spark_examples.something.launchSpark SparkExamples-1.0-SNAPSHOT-jar-with-dependencies.jar
I got this error:
java.lang.NoSuchMethodError: org.apache.spark.TaskContext.addOnCompleteCallback(Lscala/Function0;)V
at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:42)
at org.elasticsearch.spark.rdd.EsSpark$$anonfun$doSaveToEs$1.apply(EsSpark.scala:84)
at org.elasticsearch.spark.rdd.EsSpark$$anonfun$doSaveToEs$1.apply(EsSpark.scala:84)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
How can I fix this?