I have a Python script that I can run with spark-submit. I need to use it from Oozie. Can I run PySpark in Oozie as a shell job?
<!-- move files from local disk to hdfs -->
<action name="forceLoadFromLocal2hdfs">
    <shell xmlns="uri:oozie:shell-action:0.3">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
                <name>mapred.job.queue.name</name>
                <value>${queueName}</value>
            </property>
        </configuration>
        <exec>driver-script.sh</exec>
        <!-- single -->
        <argument>s</argument>
        <!-- py script -->
        <argument>load_local_2_hdfs.py</argument>
        <!-- local file to be moved -->
        <argument>localPathFile</argument>
        <!-- hdfs destination folder; be aware that the script deletes the existing folder! -->
        <argument>hdfFolder</argument>
        <file>${workflowRoot}driver-script.sh#driver-script.sh</file>
        <file>${workflowRoot}load_local_2_hdfs.py#load_local_2_hdfs.py</file>
    </shell>
    <ok to="end"/>
    <error to="killAction"/>
</action>
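
For reference, here is a minimal sketch of what load_local_2_hdfs.py could look like; the actual script is not shown in the question. The argument names localPathFile and hdfFolder are taken from the workflow above, and the use of the Hadoop FileSystem API through a SparkContext is an assumption:

# load_local_2_hdfs.py -- hypothetical sketch; the real script is not shown.
# Deletes the target HDFS folder (as the workflow comment warns) and then
# copies the local file into it, using the Hadoop FileSystem API via py4j.
import sys
from pyspark import SparkContext

def main(local_path, hdfs_folder):
    sc = SparkContext(appName="load_local_2_hdfs")
    hadoop = sc._jvm.org.apache.hadoop                         # Hadoop classes exposed through py4j
    fs = hadoop.fs.FileSystem.get(sc._jsc.hadoopConfiguration())

    dst = hadoop.fs.Path(hdfs_folder)
    if fs.exists(dst):
        fs.delete(dst, True)                                   # recursive delete of the existing folder
    fs.mkdirs(dst)                                             # recreate it empty

    fs.copyFromLocalFile(hadoop.fs.Path(local_path), dst)      # file lands inside the folder
    sc.stop()

if __name__ == "__main__":
    # argument order matches the workflow: localPathFile, then hdfFolder
    main(sys.argv[1], sys.argv[2])

driver-script.sh presumably forwards these two arguments to spark-submit (or plain python); that part is also an assumption, since the driver script is not shown either.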
The script itself runs fine through driver-script.sh. Through Oozie, the file is not copied to HDFS even though the workflow status is SUCCEEDED, and I cannot find any error logs or any logs related to the PySpark job.
I run it through Oozie as shown here.
Hi, I found the logs under YARN. The file is not being copied from local to HDFS, and that is the script's job :)
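
Since the shell action's stdout and stderr end up in the YARN container logs of the Oozie launcher, a few diagnostic lines near the top of the script make this kind of silent failure easier to spot. A small sketch; the checks and names are illustrative, not part of the original script:

# Hypothetical diagnostics for the top of load_local_2_hdfs.py.
# Whatever is printed here appears in the YARN container logs of the
# Oozie launcher that runs the shell action.
import getpass
import os
import socket
import sys

local_path = sys.argv[1]

print("running on host:", socket.gethostname())      # the shell action may run on any cluster node
print("running as user:", getpass.getuser())
print("local file exists:", os.path.exists(local_path))

if not os.path.exists(local_path):
    sys.exit(1)   # a non-zero exit code makes the shell action, and the workflow, fail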