2015-01-14

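Rebuilding an index fails on Hive on Azure HDInsight with Tez

I am trying to create indexes in Hive on Azure HDInsight with Tez enabled. I can create the indexes successfully, but I cannot rebuild them: the job fails with the following output: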

Map 1: -/- Reducer 2: 0/1 
Status: Failed 
Vertex failed, vertexName=Map 1, vertexId=vertex_1421234198072_0091_1_01, diagnostics=[Vertex Input: measures initializer failed.] 
Vertex killed, vertexName=Reducer 2, vertexId=vertex_1421234198072_0091_1_00, diagnostics=[Vertex received Kill in INITED state.] 
DAG failed due to vertex failure. failedVertices:1 killedVertices:1 
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask 

I created my table and indexes with the following job:

DROP TABLE IF EXISTS Measures; 
CREATE TABLE Measures(
    topology string, 
    val double, 
    date timestamp
) 
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' 
STORED AS TEXTFILE LOCATION 'wasb://<mycontainer>@<mystorage>.blob.core.windows.net/'; 

CREATE INDEX measures_index_topology ON TABLE Measures (topology) AS 'COMPACT' WITH DEFERRED REBUILD; 
CREATE INDEX measures_index_date ON TABLE Measures (date) AS 'COMPACT' WITH DEFERRED REBUILD; 
ALTER INDEX measures_index_topology ON Measures REBUILD; 
ALTER INDEX measures_index_date ON Measures REBUILD; 

Where am I going wrong? Why does rebuilding my indexes fail? Regards

Answer

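It looks like Tez may have a problem building an index on an empty table. I was able to reproduce the same error as you (without using the JSON SerDe). If you look at the application logs for the failed DAG, you will probably see something like this: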

java.lang.NullPointerException 
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:254) 
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:299) 
    at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getSplits(TezGroupedSplitsInputFormat.java:68) 
    at org.apache.tez.mapreduce.hadoop.MRHelpers.generateOldSplits(MRHelpers.java:263) 
    at org.apache.tez.mapreduce.common.MRInputAMSplitGenerator.initialize(MRInputAMSplitGenerator.java:139) 
    at org.apache.tez.dag.app.dag.RootInputInitializerRunner$InputInitializerCallable$1.run(RootInputInitializerRunner.java:154) 
    at org.apache.tez.dag.app.dag.RootInputInitializerRunner$InputInitializerCallable$1.run(RootInputInitializerRunner.java:146) 
    ... 

If you populate the table with a single dummy record, it seems to work fine. I used:

INSERT INTO TABLE Measures SELECT market,0,0 FROM hivesampletable limit 1; 

After that, the index rebuild ran successfully.
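
Putting the workaround together, the full sequence might look roughly like the sketch below. It reuses the table, the index names, and hivesampletable from the posts above. It assumes the table is still empty when the script runs and that one placeholder row is acceptable in your data.

```sql
-- Seed the empty table with one placeholder row so that Tez has input to split.
-- hivesampletable and its market column come from the answer above.
INSERT INTO TABLE Measures SELECT market, 0, 0 FROM hivesampletable LIMIT 1;

-- With at least one row present, rebuild both deferred indexes.
ALTER INDEX measures_index_topology ON Measures REBUILD;
ALTER INDEX measures_index_date ON Measures REBUILD;
```

Note that the placeholder row stays in the table's storage location, so later queries may need to filter it out.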