Unable to load text data as ORC into a Hive table via a temporary Hive table

I want to load a .csv file into a Hive table as an ORC file.

1) Create a temporary table and load the data into it as a text file:

CREATE TABLE IF NOT EXISTS CrimesData(ID int, Case_Number int, CrimeDate string, Block string , IUCR string,Primary_Type string, Description string, Location_Description string, Arrest string, Domestic string, Beat int, District int, Ward int, Community_Area int, FBI_Code string, X_Coordinate int, Y_Coordinate int, Year int, Updated_On string, Latitude decimal(10,10), Longitude decimal(10,10), CrimeLocation string) 
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' ESCAPED BY '"' LINES TERMINATED BY '\n' 
tblproperties("skip.header.line.count"="1");
LOAD DATA LOCAL INPATH '/home/cloudera/Documents/CrimesData.csv' INTO TABLE CrimesData;

I came across a post that suggested a workaround, so I ran the following queries.

2) Create a new table with ORC specified as the storage format:

CREATE TABLE IF NOT EXISTS CrimesDataORC(ID int, Case_Number int, CrimeDate string, Block string , IUCR string,Primary_Type string, Description string, Location_Description string, Arrest string, Domestic string, Beat int, District int, Ward int, Community_Area int, FBI_Code string, X_Coordinate int, Y_Coordinate int, Year int, Updated_On string, Latitude decimal(10,10), Longitude decimal(10,10), CrimeLocation string) 
STORED AS ORC;

3) Insert the data from the temporary table into the new table:

INSERT INTO TABLE CrimesDataORC SELECT * FROM CrimesData;

The first two steps ran without any errors, but step 3 raised the following error:

Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

I am running the above queries on the Cloudera Manager Quickstart VM 5.8.

I am not sure where I am going wrong, because similar steps for another table in the same database work as expected.

Source

2017-04-17 Chetan SP

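The data may not conform to the table structure. Try adding some conditions to the SELECT statement to check this, instead of inserting all the data at once.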
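One way to apply this suggestion, as a rough sketch rather than a confirmed fix: Hive reads a text field that cannot be parsed as the declared column type as NULL, so counting NULLs per typed column shows which columns do not match the data. The choice of columns to check and the filter condition are assumptions made for illustration. Note also that decimal(10,10) leaves no digits before the decimal point, so any coordinate with an absolute value of 1 or more would come back as NULL.

```sql
-- Count rows where typed columns came back NULL, which suggests the
-- text in the file does not fit the declared type.
SELECT COUNT(*) AS total_rows,
       SUM(CASE WHEN ID IS NULL THEN 1 ELSE 0 END)          AS null_id,
       SUM(CASE WHEN Case_Number IS NULL THEN 1 ELSE 0 END) AS null_case_number,
       SUM(CASE WHEN Year IS NULL THEN 1 ELSE 0 END)        AS null_year,
       SUM(CASE WHEN Latitude IS NULL THEN 1 ELSE 0 END)    AS null_latitude,
       SUM(CASE WHEN Longitude IS NULL THEN 1 ELSE 0 END)   AS null_longitude
FROM CrimesData;

-- Insert only rows that pass a condition (the condition is illustrative).
INSERT INTO TABLE CrimesDataORC
SELECT * FROM CrimesData
WHERE ID IS NOT NULL;
```

Both statements still launch MapReduce jobs, so if they fail with the same return code 2, the row contents are probably not the cause.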

Source

2017-04-18 02:44:52 sadap

Thanks for the suggestion. I will try it and report back on whether it works.

I tried running: INSERT INTO TABLE CrimesDataORC SELECT * FROM CrimesData LIMIT 10; but that did not solve it either. :(
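Since the insert fails even with LIMIT 10, it may be worth checking whether the failure is specific to the ORC write at all. A minimal sketch, assuming only the tables defined in the question; CrimesDataCheck is a hypothetical table name used just for this test:

```sql
-- An aggregate with GROUP BY needs a MapReduce job, unlike a plain
-- SELECT ... LIMIT, which Hive can often serve with a simple fetch.
SELECT Year, COUNT(*) AS row_count
FROM CrimesData
GROUP BY Year;

-- The same SELECT as the failing insert, written to a text table
-- instead of an ORC table.
CREATE TABLE CrimesDataCheck
STORED AS TEXTFILE
AS SELECT * FROM CrimesData LIMIT 10;
```

If both statements fail, the problem probably lies in reading CrimesData or in the cluster itself; if only the ORC insert fails, the ORC write is the place to look. Return code 2 is a generic failure code, so the actual exception has to be read from the failed job's task logs.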
