2015-07-06 85 views
0

我已經使用hadoop-0.20.x.x,hive-0.11.0。我會討論配置單元查詢:使用指定的配置,每件事情都很好,工作正常。 現在,我們已經升級到hadoop-2.6.x(hadoop2)和hive-0.14.x。還使用Apache Tez。Hive with Tez,在作業中未指定任何輸入路徑

問題是,hadoop的工作原理。但是,hive sql查詢不會。 以下查詢在舊版本中正常工作。但扔在升級版的錯誤: QUERY:SELECT abc.property_name, xyz.date, xyz.time, xyz.value_as_number, xyz.value_units FROM dbname.xyz JOIN dbname.abc ON (xyz.id = abc.src_id) WHERE xyz.person_id=138312;

例外:

INFO : Session is already open 
INFO : Tez session was closed. Reopening... 
INFO : Session re-established. 
INFO : 

INFO : Status: Running (Executing on YARN cluster with App id application_1435524970199_0035) 

INFO : Map 1: -/-  Map 2: -/- 
ERROR : Status: Failed 
ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1435524970199_0035_1_00, diagnostics=[Vertex vertex_1435524970199_0035_1_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: concept initializer failed, vertex=vertex_1435524970199_0035_1_00 [Map 1], java.io.IOException: No input paths specified in job 
     at org.apache.hadoop.hive.ql.io.HiveInputFormat.getInputPaths(HiveInputFormat.java:318) 
     at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:328) 
     at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:130) 
     at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245) 
     at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239) 
     at java.security.AccessController.doPrivileged(Native Method) 
     at javax.security.auth.Subject.doAs(Subject.java:415) 
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) 
     at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239) 
     at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226) 
     at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
     at java.lang.Thread.run(Thread.java:745) 
] 
ERROR : Vertex failed, vertexName=Map 2, vertexId=vertex_1435524970199_0035_1_01, diagnostics=[Vertex vertex_1435524970199_0035_1_01 [Map 2] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: observation initializer failed, vertex=vertex_1435524970199_0035_1_01 [Map 2], java.io.IOException: No input paths specified in job 
     at org.apache.hadoop.hive.ql.io.HiveInputFormat.getInputPaths(HiveInputFormat.java:318) 
     at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:328) 
     at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:130) 
     at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245) 
     at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239) 
     at java.security.AccessController.doPrivileged(Native Method) 
     at javax.security.auth.Subject.doAs(Subject.java:415) 
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) 
     at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239) 
     at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226) 
     at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
     at java.lang.Thread.run(Thread.java:745) 
] 
ERROR : DAG failed due to vertex failure. failedVertices:2 killedVertices:0 
Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask (state=08S01,code=2) 

異常說,No input path specified。那麼,我明白並知道如何解決haodop-mapreduce程序。但是,我們如何使用配置單元查詢來做到這一點。無論如何,我不認爲這是一樣的。

做出來的,我已經使用hive shellbeeline shell,蜂巢返回預期的輸出,但直線返回相同的異常如上。

這個問題的美妙之處在於對單個表格的查詢工作正常。但是,當我嘗試在JOIN上工作時,它會引發上述異常。 但是,我瞭解到,我的查詢有Apache Tez的影響。有人可以建議解決方案或指向tez引用,所以我可以相應地讀取和重寫查詢。謝謝

回答

0

它通過禁用apache tez。 看起來像apache tez還不穩定。