We followed the steps below but cannot query the Hive records when the data is stored in AVRO format; the table returns an "error_error ..." exception.
Imported the table from MySQL to the HDFS location user/hive/warehouse/orders/, with the table schema as:

mysql> describe orders;
+-------------------+-------------+------+-----+---------+-------+
| Field             | Type        | Null | Key | Default | Extra |
+-------------------+-------------+------+-----+---------+-------+
| order_id          | int(11)     | YES  |     | NULL    |       |
| order_date        | varchar(30) | YES  |     | NULL    |       |
| order_customer_id | int(11)     | YES  |     | NULL    |       |
| order_items       | varchar(30) | YES  |     | NULL    |       |
+-------------------+-------------+------+-----+---------+-------+
Created an external table in Hive from the same data (1):
CREATE EXTERNAL TABLE orders
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS
  INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION 'hdfs:///user/hive/warehouse/retail_stage.db/orders'
TBLPROPERTIES ('avro.schema.url'='hdfs://host_name//tmp/sqoop-cloudera/compile/bb8e849c53ab9ceb0ddec7441115125d/orders.avsc');
Sqoop command:
sqoop import \
  --connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" \
  --username=root \
  --password=cloudera \
  --table orders \
  --target-dir /user/hive/warehouse/retail_stage.db/orders \
  --as-avrodatafile \
  --split-by order_id
The describe formatted command returns an error. I tried many combinations, but all of them failed.
hive> describe orders;
OK
error_error_error_error_error_error_error    string    from deserializer
cannot_determine_schema                      string    from deserializer
check                                        string    from deserializer
schema                                       string    from deserializer
url                                          string    from deserializer
and                                          string    from deserializer
literal                                      string    from deserializer
Time taken: 1.15 seconds, Fetched: 7 row(s)
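Read top to bottom, the column names in that output look like a message rather than real columns. A minimal sketch, assuming the rows are exactly as pasted above (the variable names `describe_output` and `message` are only illustrative), that joins the first field of each row into one sentence:

```shell
# Rows copied from the `describe orders` output above (header and timing lines omitted).
describe_output='error_error_error_error_error_error_error string from deserializer
cannot_determine_schema string from deserializer
check string from deserializer
schema string from deserializer
url string from deserializer
and string from deserializer
literal string from deserializer'

# Take the first field of each row, turn underscores into spaces,
# and join the rows with single spaces.
message=$(printf '%s\n' "$describe_output" |
  awk '{ gsub(/_/, " ", $1); printf "%s%s", sep, $1; sep = " " }')

echo "$message"
```

The joined text suggests the Avro SerDe could not load a schema from the `avro.schema.url` (or `avro.schema.literal`) table property, so the schema location in the CREATE statement may be the first thing worth checking.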
The same thing worked with --as-textfile, whereas the --as-avrodatafile case throws the error.
I referred to some Stack Overflow posts but could not resolve it. Any ideas?