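Hive: JSON SerDe file returns NULL in external table

I have a set of tweets with related data (user, location, etc.) in Amazon DynamoDB. I exported it through a pipeline and got a JSON file. Exporting it as CSV would have been a bad idea, because many tweets contain commas in the text field. I am new to Hive, but I at least know that to load a JSON file I need some kind of SerDe.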

This is how I created the table:

create external table tablename (
id string, 
created_at string, 
followers_count string, 
geo string, 
location string, 
polarity string, 
screen_name string, 
sentiment string, 
subjectivity string, 
tweet string, 
username string) 
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' 
STORED AS TEXTFILE;

I did not get any errors, but then I ran:

load data inpath '/user/exam' 
overwrite into table tablename;

(This is where the JSON file is stored.)

When I run "select * from tablename limit 5;" everything comes back NULL:

hive> select * from wcd.tablename limit 5; 
OK 
{ NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL 
{ NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL 
{ NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL 
{ NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL 
{ NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL

If anyone would like to look at the file in question, it is available at:

http://www.vaughn-s.net/hadoop

Any assistance would be greatly appreciated!

Source

2017-08-08 lengthy_preamble

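Can you post a sample of your data? – hlagos

Yes, there is a link to the whole file at the bottom of this post; however, I can post snippets if you prefer. –

The reason is that your JSON does not match your table definition. Every value in the file is wrapped in an object keyed by "s" rather than being a plain string, so columns declared as string cannot be read: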

{"id":{"s":"894643473017561088"},"sentiment":{"s":"neutral"},"subjectivity":{"s":"0.0"},"username":{"s":"Jessi"},"geo":{"s":"None"},"location":{"s":"Valley of the sun☀️"},"polarity":{"s":"0.0"},"tweet":{"s":"b\"RT @bannerite: Donald Trump's lies have consequences. We're seeing them now | Charlotte Observer #DemForce https\""},"created_at":{"s":"Mon Aug 07 19:36:40 +0000 2017"},"screen_name":{"s":"JessiAtkins06"},"followers_count":{"s":"19"}}

Try declaring each column as a struct that contains a string, for example:

id struct<s:string>
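
Applied to every column from the question, the suggestion might look roughly like the sketch below. The names tweets_raw and tweets_flat are hypothetical, and the sketch assumes that the JSON file still sits in /user/exam and that each record occupies a single line of the file. The view is optional; it only unwraps the "s" field so that queries see plain string columns again.

```sql
-- Sketch only: tweets_raw and tweets_flat are hypothetical names.
-- Assumes the exported JSON file is in /user/exam, one record per line.
CREATE EXTERNAL TABLE tweets_raw (
  id              struct<s:string>,
  created_at      struct<s:string>,
  followers_count struct<s:string>,
  geo             struct<s:string>,
  `location`      struct<s:string>,
  polarity        struct<s:string>,
  screen_name     struct<s:string>,
  sentiment       struct<s:string>,
  subjectivity    struct<s:string>,
  tweet           struct<s:string>,
  username        struct<s:string>
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS TEXTFILE
LOCATION '/user/exam';

-- Optional: expose the wrapped values as plain string columns.
CREATE VIEW tweets_flat AS
SELECT
  id.s              AS id,
  created_at.s      AS created_at,
  followers_count.s AS followers_count,
  geo.s             AS geo,
  `location`.s      AS `location`,
  polarity.s        AS polarity,
  screen_name.s     AS screen_name,
  sentiment.s       AS sentiment,
  subjectivity.s    AS subjectivity,
  tweet.s           AS tweet,
  username.s        AS username
FROM tweets_raw;

-- Spot check: these columns should no longer come back as NULL.
SELECT id, screen_name, sentiment FROM tweets_flat LIMIT 5;
```

Pointing the external table at the directory with LOCATION leaves the file where it is, whereas "load data inpath" moves it into the table's directory.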

Source

2017-08-08 03:12:26 hlagos

I hadn't noticed that - thanks for the tip! –
