Class not found when running Spark on YARN

The same code runs fine in standalone Spark, but it fails when I run it on YARN. The exception, thrown on the executors (the YARN containers), is java.lang.NoClassDefFoundError: Could not initialize class org.elasticsearch.common.xcontent.json.JsonXContent. When I assemble with Maven, I do include the Elasticsearch jar in the application assembly jar. The run command is:

spark-submit --executor-memory 10g --executor-cores 2 --num-executors 2 
--queue thejob --master yarn --class com.batch.TestBat /lib/batapp-mr.jar 2016-12-20 

The Maven dependencies are as follows:

<dependency> 
    <groupId>org.apache.spark</groupId> 
    <artifactId>spark-hive_2.10</artifactId> 
    <version>1.6.0</version> 
    <scope>provided</scope> 
</dependency> 
<dependency> 
    <groupId>org.apache.spark</groupId> 
    <artifactId>spark-mllib_2.10</artifactId> 
    <version>1.6.0</version> 
    <scope>provided</scope> 
</dependency> 
<dependency> 
    <groupId>org.apache.spark</groupId> 
    <artifactId>spark-core_2.10</artifactId> 
    <version>1.6.0</version> 
    <scope>provided</scope> 
</dependency> 
<dependency> 
    <groupId>org.apache.spark</groupId> 
    <artifactId>spark-sql_2.10</artifactId> 
    <version>1.6.0</version> 
    <scope>provided</scope> 
</dependency> 
<dependency> 
    <groupId>org.apache.spark</groupId> 
    <artifactId>spark-catalyst_2.10</artifactId> 
    <version>1.6.0</version> 
    <scope>provided</scope> 
</dependency> 

<dependency> 
    <groupId>com.fasterxml.jackson.core</groupId> 
    <artifactId>jackson-core</artifactId> 
    <version>2.6.3</version> 
    <!-- <scope>provided</scope> --> 
</dependency> 
<dependency> 
    <groupId>org.apache.hbase</groupId> 
    <artifactId>hbase-client</artifactId> 
    <version>1.2.0-cdh5.7.0</version> 
    <!--<scope>provided</scope> --> 
</dependency> 
<dependency> 
    <groupId>org.apache.hbase</groupId> 
    <artifactId>hbase-server</artifactId> 
    <version>1.2.0-cdh5.7.0</version> 
    <!--<scope>provided</scope> --> 
</dependency> 
<dependency> 
    <groupId>org.apache.hbase</groupId> 
    <artifactId>hbase-protocol</artifactId> 
    <version>1.2.0-cdh5.7.0</version> 
    <!--<scope>provided</scope> --> 
</dependency> 
<dependency> 
    <groupId>org.apache.hbase</groupId> 
    <artifactId>hbase-hadoop2-compat</artifactId> 
    <version>1.2.0-cdh5.7.0</version> 
    <!--<scope>provided</scope> --> 
</dependency> 
<dependency> 
    <groupId>org.apache.hbase</groupId> 
    <artifactId>hbase-common</artifactId> 
    <version>1.2.0-cdh5.7.0</version> 
    <!--<scope>provided</scope> --> 
</dependency> 
<dependency> 
    <groupId>org.apache.hbase</groupId> 
    <artifactId>hbase-hadoop-compat</artifactId> 
    <version>1.2.0-cdh5.7.0</version> 
    <!--<scope>provided</scope> --> 
</dependency> 


<dependency> 
    <groupId>com.sksamuel.elastic4s</groupId> 
    <artifactId>elastic4s-core_2.10</artifactId> 
    <version>2.3.0</version> 
    <!--<scope>provided</scope> --> 
    <exclusions> 
     <exclusion> 
      <artifactId>elasticsearch</artifactId> 
      <groupId>org.elasticsearch</groupId> 
     </exclusion> 
    </exclusions> 
</dependency> 
<dependency> 
    <groupId>org.elasticsearch</groupId> 
    <artifactId>elasticsearch</artifactId> 
    <version>2.3.2</version> 
</dependency> 
<dependency> 
    <groupId>org.elasticsearch</groupId> 
    <artifactId>elasticsearch-hadoop</artifactId> 
    <version>2.3.1</version> 
    <exclusions> 
     <exclusion> 
      <artifactId>log4j-over-slf4j</artifactId> 
      <groupId>org.slf4j</groupId> 
     </exclusion> 
    </exclusions> 
</dependency> 

The strange thing is that the executors can find the HBase classes and some of the Elasticsearch classes, both of which come from these dependencies, but not certain Elasticsearch classes, so I suspect there is some kind of class conflict. I checked that the assembly jar does contain the "missing" class.
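One way to run that check is with standard jar tooling, for example (the jar path is the one from the submit command above):

# List the Elasticsearch and Jackson entries bundled in the assembly jar
jar tf /lib/batapp-mr.jar | grep -iE 'xcontent|jackson'

# If the assembly was built by Maven, the bundled jackson-core version is usually
# also recorded in its pom.properties inside the jar
unzip -p /lib/batapp-mr.jar META-INF/maven/com.fasterxml.jackson.core/jackson-core/pom.properties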

Can you add the dependency list here? – mrsrinivas

I just added it, please advise! – Jack

Please check my answer, I hope it helps! –

Answer

I can see that you have already included the jar dependency. In addition, you have commented out the provided scope on it, which means it will be packaged and available in your deployment.

<dependency> 
    <groupId>com.fasterxml.jackson.core</groupId> 
    <artifactId>jackson-core</artifactId> 
    <version>2.6.3</version> 
</dependency> 

The only thing I suspect is the spark-submit command; please check the following.

--conf "spark.driver.extraLibrayPath=$HADOOP_HOME/*:$HBASE_HOME/*:$HADOOP_HOME/lib/*:$HBASE_HOME/lib/htrace-core-3.1.0-incubating.jar:$HDFS_PATH/*:$SOLR_HOME/*:$SOLR_HOME/lib/*" \ 
     --conf "spark.executor.extraLibraryPath=$HADOOP_HOME/*" \ 
--conf "spark.driver.extraClassPath=$(echo /your directory of jars/*.jar | tr ' ' ',') 
      --conf "spark.executor.extraClassPath=$(echo /your directory of jars/*.jar | tr ' ' ',') 

where "your directory of jars" is the lib directory extracted from the distribution.
You can also print the classpath from within your program, like below:

val cl = ClassLoader.getSystemClassLoader 
cl.asInstanceOf[java.net.URLClassLoader].getURLs.foreach(println) 
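The getURLs listing shows the system classpath on whichever side it runs (the driver, unless it is executed inside a task). A related check, sketched below on the assumption that sc is your SparkContext and that wrapping it in a small job makes it run on an executor, is to ask a suspect class which jar it was actually loaded from:

val suspects = Seq( 
  "org.elasticsearch.common.xcontent.json.JsonXContent", 
  "com.fasterxml.jackson.core.JsonFactory") 
sc.parallelize(suspects, suspects.size).foreach { name => 
  // Load the class without running its static initializers, then print the 
  // jar it came from; the output lands in the executor's stderr log. 
  val cls = Class.forName(name, false, Thread.currentThread.getContextClassLoader) 
  val src = Option(cls.getProtectionDomain.getCodeSource).map(_.getLocation) 
  println(s"$name -> ${src.getOrElse("bootstrap classpath")}") 
} 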

Edit: after running the lines above, if you find an old copy of the jar on the classpath, then include your library with your application or pass it with --jars, but also try setting spark.{driver,executor}.userClassPathFirst to true.
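Put together, a submit command along those lines might look like the sketch below; the extra jar paths are placeholders for wherever the correct versions live, and userClassPathFirst is still marked experimental in Spark 1.6, so it is worth trying on a small run first:

spark-submit --master yarn --queue thejob --class com.batch.TestBat \ 
  --executor-memory 10g --executor-cores 2 --num-executors 2 \ 
  --conf spark.driver.userClassPathFirst=true \ 
  --conf spark.executor.userClassPathFirst=true \ 
  --jars /path/to/jackson-core-2.6.3.jar,/path/to/elasticsearch-2.3.2.jar \ 
  /lib/batapp-mr.jar 2016-12-20 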

Did it help you? –

Hi Ram, I found the cause. When I printed the classpath from ClassLoader.getSystemClassLoader as you suggested, I found it is using /opt/cloudera/parcels/CDH-5.7.0-1.cdh5.7.0.p0.45/jars/jackson-core-2.2.3.jar, but I actually need jackson-core-2.6.3. How can I override the Cloudera system jar? – Jack

@Jack That's the thing, resolving these conflicts is very tricky in Spark when you use a library that Spark itself also uses. Include your library with your application or use --jars, but also try setting 'spark.{driver,executor}.userClassPathFirst' to 'true' –
