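I want to load a .csv file into a Hive table as an ORC file. Loading the text data into a temporary Hive table works, but I cannot get it from there into an ORC-backed Hive table.
1) Create a temporary table and load the data into it as a text file: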
CREATE TABLE IF NOT EXISTS CrimesData(ID int, Case_Number int, CrimeDate string, Block string , IUCR string,Primary_Type string, Description string, Location_Description string, Arrest string, Domestic string, Beat int, District int, Ward int, Community_Area int, FBI_Code string, X_Coordinate int, Y_Coordinate int, Year int, Updated_On string, Latitude decimal(10,10), Longitude decimal(10,10), CrimeLocation string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' ESCAPED BY '"' LINES TERMINATED BY '\n'
tblproperties("skip.header.line.count"="1");
LOAD DATA LOCAL INPATH '/home/cloudera/Documents/CrimesData.csv' INTO TABLE CrimesData;
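Before converting to ORC, it may help to confirm that the text table was loaded and parsed as expected. The following is only a minimal sketch, assuming the `CrimesData` table from step 1 exists in the current database; the choice of columns to inspect is arbitrary.

```sql
-- Show the storage format, location, and table properties Hive recorded for the table.
DESCRIBE FORMATTED CrimesData;

-- Peek at a few rows. With the default text SerDe, values that do not fit
-- the declared column type generally show up as NULL rather than raising an error.
SELECT ID, Case_Number, CrimeDate, Latitude, Longitude
FROM CrimesData
LIMIT 5;
```

If many columns come back as NULL, the delimiter, escape character, or column types in the table definition are worth rechecking against the CSV file.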
I came across a post that suggests a workaround for this problem, so I ran the queries below.
2) Create a new table and specify ORC as its storage format:
CREATE TABLE IF NOT EXISTS CrimesDataORC(ID int, Case_Number int, CrimeDate string, Block string , IUCR string,Primary_Type string, Description string, Location_Description string, Arrest string, Domestic string, Beat int, District int, Ward int, Community_Area int, FBI_Code string, X_Coordinate int, Y_Coordinate int, Year int, Updated_On string, Latitude decimal(10,10), Longitude decimal(10,10), CrimeLocation string)
STORED AS ORC;
3) Insert the data from the temporary table into the new table:
INSERT INTO TABLE CrimesDataORC SELECT * FROM CrimesData;
The first two steps run without any errors, but step 3 fails with the following error:
Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
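Return code 2 from `MapRedTask` is a generic job-failure code; the underlying exception usually appears in the logs of the failed MapReduce task rather than in this message. As a rough way to narrow things down, one could check whether a MapReduce job that only reads the text table succeeds. This is a sketch under the assumption that the same session and tables are used, not a confirmed fix.

```sql
-- Make sure the count is computed by scanning the data, not answered from metastore statistics.
SET hive.compute.query.using.stats=false;

-- This launches a MapReduce job that only reads the text table.
-- If it fails with the same error, the problem is likely in reading CrimesData
-- (or in the cluster itself) rather than in writing ORC.
SELECT COUNT(*) FROM CrimesData;
```

If this query succeeds while the `INSERT` still fails, the ORC write side (for example, memory available to the tasks) becomes the more likely place to look.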
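I am running the above queries on the Cloudera Manager Quickstart VM 5.8.
I don't know where I am going wrong, because similar steps for another table in the same database work as expected.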
Thanks for the suggestion. I will try it and post an update on whether it works.
I tried executing: INSERT INTO TABLE CrimesDataORC SELECT * FROM CrimesData LIMIT 10; but that did not solve it either.. :(