How to query a struct array with Hive (get_json_object) or a JSON SerDe
I want to query the following sample JSON file, which is stored in my HDFS:
{
"tag1": "1.0",
"tag2": "blah",
"tag3": "blahblah",
"tag4": {
"tag4_1": [{
"tag4_1_1": [{
"tag4_1_1_1": {
"Addr": {
"Addr1": "blah",
"City": "City",
"StateProvCd": "NY",
"PostalCode": "99999"
}
}
}, {
"tag4_1_1_1": {
"Addr": {
"Addr1": "blah2",
"City": "City2",
"StateProvCd": "NY",
"PostalCode": "99999"
}
}
}
]
}
]
}
}
I created an external table over the data with the following statement:
CREATE EXTERNAL TABLE DB.hv_table
(
tag1 string
, tag2 string
, tag3 string
, tag4 struct<tag4_1:ARRAY<struct<tag4_1_1:ARRAY<struct<tag4_1_1_1:struct<Addr:struct<
Addr1:string
, City:string
, StateProvCd:string
, PostalCode:string>>>>>>>
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION 'HDFS/location';
Ideally, I would like to query the data so that it is returned to me like this:
select tag1, tag2, tag3, tag4(all data) from DB.hv_table;
Can someone give me an example of how to query this without writing it in the following way:
select tag1, tag2, tag3
, tag4.tag4_1[0].tag4_1_1[0].tag4_1_1_1.Addr.Addr1 as Addr1
, tag4.tag4_1[0].tag4_1_1[0].tag4_1_1_1.Addr.City as City
, tag4.tag4_1[0].tag4_1_1[0].tag4_1_1_1.Addr.StateProvCd as StateProvCd
, tag4.tag4_1[0].tag4_1_1[0].tag4_1_1_1.Addr.PostalCode as PostalCode
from DB.hv_table
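Most importantly, I do not want to hard-code the number of elements in the array. In my example I can only target the first element of the array (tag4_1_1_1). If possible, I would like to target every element.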
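For reference, here is a rough sketch of the kind of query I am after. It assumes that Hive's LATERAL VIEW with explode() can be chained over the two nested arrays (tag4_1, then tag4_1_1) so that each array element becomes its own row. I have not verified it against the table above, and the aliases t, o, i, outer_elem and inner_elem are made up for illustration.

```sql
-- Sketch only: one output row per element of the innermost array,
-- without referring to any array index such as [0].
SELECT t.tag1
     , t.tag2
     , t.tag3
     , inner_elem.tag4_1_1_1.Addr.Addr1       AS Addr1
     , inner_elem.tag4_1_1_1.Addr.City        AS City
     , inner_elem.tag4_1_1_1.Addr.StateProvCd AS StateProvCd
     , inner_elem.tag4_1_1_1.Addr.PostalCode  AS PostalCode
FROM DB.hv_table t
-- explode the outer array: one row per struct in tag4.tag4_1
LATERAL VIEW explode(t.tag4.tag4_1) o AS outer_elem
-- explode the inner array of each outer struct
LATERAL VIEW explode(outer_elem.tag4_1_1) i AS inner_elem;
```

As far as I understand, a plain LATERAL VIEW drops rows whose array is empty or NULL; LATERAL VIEW OUTER would presumably keep them with NULL address columns.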