I have created an Oozie workflow to run a Sqoop import from MySQL into Hive. The Oozie workflow for the Sqoop import fails in Hue on Amazon EMR.

My Sqoop command that creates the Sqoop job runs successfully, but the job fails when I execute it to import from MySQL into Hive. I have attached the log below.

sqoop --hive-import (which is what the failing Sqoop action performs) happens in two steps:

  1. First, a Sqoop import into an HDFS directory (the targetDir referenced in my XML).

  2. Then the output of this import is moved and loaded into Hive.

When I run my Sqoop job through Oozie, I can see the _SUCCESS file in the targetDir, which indicates that the Sqoop import (step 1) succeeded. Only the later stage (step 2) fails.

I run the Oozie workflow as the hue user.
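
For context, a minimal sketch of the kind of command the failing action runs; every value below (host, database, table, paths) is a placeholder, not taken from the original workflow:

    # Step 1 (import into HDFS) is driven by --target-dir;
    # step 2 (load into Hive) by --hive-import. All values are placeholders.
    sqoop import \
        --connect jdbc:mysql://db-host:3306/mydb \
        --username myuser \
        --password-file /user/hue/mysql.pwd \
        --table mytable \
        --target-dir /user/hue/targetDir \
        --hive-import \
        --hive-table mytable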

9020 [uber-SubtaskRunner] INFO org.apache.sqoop.hive.HiveImport - Loading uploaded data into Hive 
9982 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - WARNING: Use "yarn jar" to launch YARN applications. 
10278 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - SLF4J: Class path contains multiple SLF4J bindings. 
10278 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - SLF4J: Found binding in [jar:file:/usr/lib/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] 
10278 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] 
10278 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 
10281 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] 
12413 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - 
12413 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - Logging initialized using configuration in jar:file:/usr/lib/hive/lib/hive-common-2.1.0-amzn-0.jar!/hive-log4j2.properties Async: true 
13750 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/tez/dag/api/SessionNotRunning 
13751 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:586) 
13751 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:518) 
13751 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:705) 
13751 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) 
13751 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
13751 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
13751 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
13751 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - at java.lang.reflect.Method.invoke(Method.java:498) 
13751 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - at org.apache.hadoop.util.RunJar.run(RunJar.java:221) 
13751 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - at org.apache.hadoop.util.RunJar.main(RunJar.java:136) 
13751 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - Caused by: java.lang.ClassNotFoundException: org.apache.tez.dag.api.SessionNotRunning 
13751 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - at java.net.URLClassLoader.findClass(URLClassLoader.java:381) 
13751 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - at java.lang.ClassLoader.loadClass(ClassLoader.java:424) 
13751 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) 
13751 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - at java.lang.ClassLoader.loadClass(ClassLoader.java:357) 
13751 [Thread-112] INFO org.apache.sqoop.hive.HiveImport - ... 10 more 
14098 [uber-SubtaskRunner] ERROR org.apache.sqoop.tool.CreateHiveTableTool - Encountered IOException running create table job: java.io.IOException: Hive exited with status 1 
    at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:389) 
    at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:339) 
    at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:240) 
    at org.apache.sqoop.tool.CreateHiveTableTool.run(CreateHiveTableTool.java:58) 
    at org.apache.sqoop.Sqoop.run(Sqoop.java:143) 
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) 
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179) 
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218) 
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227) 
    at org.apache.sqoop.Sqoop.main(Sqoop.java:236) 
    at org.apache.oozie.action.hadoop.SqoopMain.runSqoopJob(SqoopMain.java:197) 
    at org.apache.oozie.action.hadoop.SqoopMain.run(SqoopMain.java:177) 
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:47) 
    at org.apache.oozie.action.hadoop.SqoopMain.main(SqoopMain.java:46) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:498) 
    at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:236) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) 
    at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:380) 
    at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:301) 
    at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:187) 
    at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:230) 
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
    at java.lang.Thread.run(Thread.java:745) 

Intercepting System.exit(1) 

<<< Invocation of Main class completed <<< 

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1] 

Answers

The error occurs because Sqoop cannot find Hive.

Try setting the Hive environment variables on all of your datanodes, for the user that runs this Sqoop command.
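
For example, something along these lines in that user's shell profile; the Hive path is an assumption for a typical EMR node, not confirmed by the thread:

    # Assumed Hive install location on EMR; adjust to your cluster.
    export HIVE_HOME=/usr/lib/hive
    export PATH=$HIVE_HOME/bin:$PATH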

Edit:

Type this at the command line:

oozie admin -shareliblist hive 

and check whether tez-api-*.jar is listed.

If it is not, go to the HDFS path /user/oozie/share/lib/lib_<timestamp>/hive/ and check whether tez-api-*.jar exists there. If it does, update the sharelib with oozie admin -sharelibupdate and check again.
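
Put together, the check-and-refresh sequence looks roughly like this (the Oozie URL is a placeholder; omit -oozie if OOZIE_URL is already set):

    # List what the Oozie hive sharelib serves and look for the Tez jars
    oozie admin -oozie http://oozie-host:11000/oozie -shareliblist hive | grep tez

    # Inspect the sharelib directory on HDFS directly
    hdfs dfs -ls /user/oozie/share/lib/lib_*/hive/ | grep tez

    # Refresh the sharelib so Oozie picks up any jars added on HDFS
    oozie admin -oozie http://oozie-host:11000/oozie -sharelibupdate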


I set HIVE_HOME in my bashrc and reloaded it, but no luck. I also pointed my Sqoop job at Hive by setting the --hive-home property. The other datanodes have no hive folder under /etc/. Could you give me some pointers or elaborate on your answer? – Hurix


Did you also add this: "export HIVE=$HIVE_HOME/bin; export PATH=$HIVE:$PATH", and then run "exec bash"? –


The problem is that Oozie may run the job on any datanode, so you should install and configure Hive on all the datanodes, then try again. –


Sqoop could not find the Tez jars at runtime (hence the missing org.apache.tez.dag.api.SessionNotRunning class).

Please check whether you have the Tez jars available to the action.
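
Two possible ways to get them there, sketched under the assumption that a typical EMR layout keeps Tez under /usr/lib/tez on the master; none of these paths are confirmed by the thread:

    # Option A: copy the Tez jars into the Oozie hive sharelib; replace
    # lib_<timestamp> with the directory that
    # 'hdfs dfs -ls /user/oozie/share/lib/' actually shows, then refresh.
    hdfs dfs -put /usr/lib/tez/tez-*.jar /user/oozie/share/lib/lib_<timestamp>/hive/
    oozie admin -sharelibupdate

    # Option B: make the sqoop action also load the hive sharelib
    # (oozie.action.sharelib.for.<action> is a standard Oozie property).
    echo 'oozie.action.sharelib.for.sqoop=sqoop,hive' >> job.properties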


When I run the Sqoop job on the master node, it runs successfully, because Tez is configured there and I can access Hive from the master. When I execute the Sqoop job through the Oozie workflow, the Hive step runs as a YARN MapReduce task on a slave node. When I try to launch Hive on a slave node, it says the Tez session is not running, even though Hive and Tez are configured correctly. Is this a limitation of Amazon EMR, or am I missing some major configuration? – Hurix


@Hurix Did you try adding tez-site.xml and the required Tez libraries? – YoungHobbit


@YoungHobbit Yes, it is in place – Hurix