2016-09-29 121 views

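Hive: how to get elements from an array when the elements themselves are arrays

I have a database table with a column that stores a JSON-formatted string. The string itself contains multiple elements, like an array. Each element contains multiple key-value pairs. Some values may themselves contain multiple key-value pairs, such as the "address" property below.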

People table: 
    Col1  Col2 ..... info 
    aaa  bbb   see below 

The "info" column contains a JSON-formatted string like the following:

[{"name":"abc", 
    "address":{"street":"str1", "city":"c1"}, 
    "phone":"1234567" 
}, 
{"name":"def", 
    "address":{"street":"str2", "city":"c1", "county":"ct"}, 
    "phone":"7145895" 
} 
] 

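I need to get the individual value of each field in the JSON string. I can do this for every field except the "address" field by calling explode(), as shown below: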

SELECT 
    get_json_object(person, '$.name') AS name, 
    get_json_object(person, '$.phone') AS phone, 
    get_json_object(person, '$.address') AS addr 
FROM people lateral view explode(split(regexp_replace(
     regexp_replace(info, '\\}\\,\\{', '\\}\\\\n\\{'), '\\[|\\]',''), '\\\\n')) 
     p as person; 

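My question is how to get each field inside the "address" field. The "address" field can contain any number of key-value pairs, and I cannot use a JSON SerDe. I was thinking of using another explode() call, but I could not get it to work. Can someone please help? Many thanks.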

Answer


You can call get_json_object directly with nested paths:

SELECT 
    get_json_object(person, '$.name') AS name, 
    get_json_object(person, '$.phone') AS phone, 
    get_json_object(person, '$.address.street') AS street, 
    get_json_object(person, '$.address.city') AS city, 
    get_json_object(person, '$.address.county') AS county 
FROM people lateral view explode(split(regexp_replace(
    regexp_replace(info, '\\}\\,\\{', '\\}\\\\n\\{'), '\\[|\\]',''), '\\\\n')) 
    p as person; 

A key that is missing from a given address (such as "county" in the first element) comes back as NULL.
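
Fixed paths such as '$.address.county' only cover keys that are known in advance, while the question says "address" can hold any number of key-value pairs. One possible way to handle that, offered here as a rough sketch rather than a tested solution, is to strip the braces and quotes from the address JSON, turn the result into a map with str_to_map, and explode that map into one row per key. It assumes that address values never contain commas, colons, double quotes, or nested objects, and it reuses the splitting expression from the question unchanged. The aliases a, addr_key, and addr_value are illustrative names, not part of the original post.

```sql
-- Sketch only: assumes address values contain no commas, colons,
-- double quotes, or nested objects; the string-based parsing breaks on them.
-- The inner regexp_replace turns {"street":"str1","city":"c1"} into
-- street:str1,city:c1, which str_to_map splits on ',' and ':'.
SELECT
    get_json_object(person, '$.name')  AS name,
    get_json_object(person, '$.phone') AS phone,
    addr_key,
    addr_value
FROM people
LATERAL VIEW explode(split(regexp_replace(
    regexp_replace(info, '\\}\\,\\{', '\\}\\\\n\\{'), '\\[|\\]', ''), '\\\\n'))
    p AS person
LATERAL VIEW explode(str_to_map(
    regexp_replace(get_json_object(person, '$.address'), '[{}"]', ''),
    ',', ':'))
    a AS addr_key, addr_value;
```

Under those assumptions, each person should produce one output row per address key, so an element with street, city, and county would appear three times. If some elements may lack an address entirely, LATERAL VIEW OUTER on the second lateral view is probably needed to keep those people in the result.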