
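Hortonworks Oozie Spark Action - NullPointerException

I am running HDP 2.5.3 with Oozie 4.2.0. The Spark action is set up to run in yarn-client mode. The Spark job fetches data from Hive tables, processes it, and stores the result in HDFS. However, when I try to submit the Spark application from the Spark action, I get a NullPointerException.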

workflow.xml

<workflow-app xmlns="uri:oozie:workflow:0.5" name="Spark_Test">
    <global>
        <job-tracker>${job_tracker}</job-tracker>
        <name-node>${name_node}</name-node>
    </global>
    <credentials>
        <credential name="hiveCredentials" type="hive2">
            <property>
                <name>hive2.jdbc.url</name>
                <value>${hive_beeline_server}</value>
            </property>
            <property>
                <name>hive2.server.principal</name>
                <value>${hive_kerberos_principal}</value>
            </property>
        </credential>
    </credentials>
    <start to="SparkTest" />
    <action name="SparkTest" cred="hiveCredentials">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${job_tracker}</job-tracker>
            <name-node>${name_node}</name-node>
            <master>yarn-client</master>
            <name>Spark Hive Example</name>
            <class>com.fbr.genjson.exec.GenExecJson</class>
            <jar>${jarPath}/fedebomrpt_genjson.jar</jar>
            <spark-opts>--jars /usr/hdp/current/spark-client/lib/datanucleus-api-jdo-3.2.6.jar,/usr/hdp/current/spark-client/lib/datanucleus-rdbms-3.2.9.jar,/usr/hdp/current/spark-client/lib/datanucleus-core-3.2.10.jar --files /etc/hive/conf/hive-site.xml --conf spark.sql.hive.convertMetastoreOrc=false --driver-memory 2g --executor-memory 16g --executor-cores 4 --conf spark.ui.port=5051 --queue fbr</spark-opts>
            <arg>${arg1}</arg>
            <arg>${arg2}</arg>
        </spark>
        <ok to="end" />
        <error to="fail" />
    </action>
    <kill name="fail">
        <message>Spark Java PatentCitation failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end" />
</workflow-app>

Exception:

SERVER[xxx.hpc.xx.com] USER[prxtcbrd] GROUP[-] TOKEN[] APP[Spark_Test] JOB[0004629-170625082345353-oozie-oozi-W] ACTION[0004629-170625082345353-oozie-oozi-W@SparkTest] Error starting action [SparkTest]. ErrorType [ERROR], ErrorCode [NullPointerException], Message [NullPointerException: null]
org.apache.oozie.action.ActionExecutorException: NullPointerException: null 
    at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:446) 
    at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1202) 
    at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1373) 
    at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:232) 
    at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63) 
    at org.apache.oozie.command.XCommand.call(XCommand.java:287) 
    at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:331) 
    at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:260) 
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
    at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:178) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
    at java.lang.Thread.run(Thread.java:745) 
Caused by: java.lang.NullPointerException 
    at org.apache.oozie.action.hadoop.SparkActionExecutor.setupActionConf(SparkActionExecutor.java:85) 
    at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1091) 
    ... 11 more 

I don't know what I am doing wrong. Do I need to add any configuration XML other than hive-site.xml?


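The post says 'yarn-cluster', but workflow.xml uses 'yarn-client' mode. Please check that. I think this exception occurs before the Spark job is even submitted, while the submission is being prepared. – YoungHobbit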


@YoungHobbit My bad, that was a mistake on my side. It is yarn-client only; I posted it incorrectly. – user2731629

Answer


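In your example, you import the jar files and hive-site.xml yourself (through --jars and --files). I don't think that is necessary, because Oozie already pulls these in. Take a look at the Spark action below; I think it may solve your problem.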

<action name="myfirstsparkjob" cred="hive_credentials">
    <spark xmlns="uri:oozie:spark-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
                <name>mapred.compress.map.output</name>
                <value>true</value>
            </property>
            <property>
                <name>mapred.job.queue.name</name>
                <value>${queueName}</value>
            </property>
        </configuration>
        <master>yarn</master>
        <mode>cluster</mode>
        <name>Spark Hive Example</name>
        <class>com.fbr.genjson.exec.GenExecJson</class>
        <jar>${jarPath}/fedebomrpt_genjson.jar</jar>
        <spark-opts>--queue queue_name --executor-memory 28G --num-executors 70 --executor-cores 5</spark-opts>
    </spark>
    <ok to="end" />
    <error to="fail" />
</action>

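You can also set the Oozie properties below for your workflow.xml file:

oozie.use.system.libpath=true
oozie.libpath=${jarPath}

Make sure you put all of your user-created libraries and files inside the ${jarPath} directory.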
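For illustration, these properties usually sit next to the other job parameters in a job.properties file. The sketch below is only an assumption about how that file might look for the workflow in the question. The host names, ports, and HDFS paths are made-up placeholders. The variable names name_node, job_tracker, and jarPath are the ones the question's workflow.xml references.

```properties
# Hypothetical job.properties for the workflow in the question.
# Host names, ports and paths are placeholders, not values from the original post.
name_node=hdfs://namenode.example.com:8020
job_tracker=resourcemanager.example.com:8050
jarPath=${name_node}/user/${user.name}/apps/spark_test/lib

# Let Oozie add its share library to the action classpath.
oozie.use.system.libpath=true
# Extra HDFS directory whose jars and files are added to the action classpath.
oozie.libpath=${jarPath}
# HDFS directory that contains workflow.xml.
oozie.wf.application.path=${name_node}/user/${user.name}/apps/spark_test

# The workflow also references hive_beeline_server, hive_kerberos_principal,
# arg1 and arg2; they would need to be defined here as well.
```

The job would then be submitted with the Oozie CLI, pointing its -config option at this file.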
