JSON file
Query (no external JAR file needed :)
SELECT
  three.version,
  three.overrideable,
  get_json_object(three.strategy, '$.scenario') AS scenario,
  get_json_object(three.strategy, '$.repairType') AS repairType,
  get_json_object(three.strategy, '$.rank') AS rank,
  get_json_object(three.strategy, '$.notificationType') AS notificationType
FROM (
  -- one output row per element of rs_array
  SELECT s.version, s.overrideable, ex.strategy
  FROM (
    -- split the '|'-delimited string into an array of JSON object strings
    SELECT two.version AS version,
           two.overrideable AS overrideable,
           split(two.repairStrategies, "\\|") AS rs_array
    FROM (
      -- strip the surrounding [ ] and turn the "},{" separators into "}|{"
      SELECT one.version,
             one.overrideable AS overrideable,
             regexp_replace(regexp_replace(one.repairStrategies, '\\[|\\]', ''), '\\}\\,\\{', '\\}\\|\\{') AS repairStrategies
      FROM (
        -- pull the top-level fields out of the raw JSON line
        SELECT get_json_object(helper_json.line, '$.version') AS version,
               get_json_object(helper_json.line, '$.channelOutcome.MG.overrideable') AS overrideable,
               get_json_object(helper_json.line, '$.channelOutcome.MG.repairStrategies') AS repairStrategies
        FROM helper_json
      ) one
    ) two
  ) s
  -- the lateral view gets its own alias (ex) so it does not clash with the subquery alias s
  LATERAL VIEW explode(s.rs_array) ex AS strategy
) three;
where helper_json has the following schema:
hive (vijay)> describe helper_json;
OK
line string None
Time taken: 0.056 seconds, Fetched: 1 row(s)
hive (vijay)> select * from helper_json;
OK
{"channelOutcome":{"MG":{"repairStrategies":[{"scenario":"1","repairType":"ISR","rank":1,"notificationType":"Z5"},{"scenario":"1","repairType":"SER","rank":2,"notificationType":"NO"},{"scenario":"1","repairType":"ACC","rank":3,"notificationType":"Z5"},{"scenario":"1","repairType":"SWP","rank":4,"notificationType":"Z5"},{"scenario":"4","repairType":"RMS","rank":5,"notificationType":"Z8"}],"overrideable":false}},"keyValues":[],"version":2.3}
Time taken: 0.144 seconds, Fetched: 1 row(s)
hive (vijay)>
Output (added to give a better idea of what the result looks like):
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201503240233_5513, Tracking URL = http://dragon1:50030/jobdetails.jsp?jobid=job_201503250213_4613
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_201503240233_5513
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2015-07-04 05:06:51,144 Stage-1 map = 0%, reduce = 0%
2015-07-04 05:06:56,178 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.5 sec
2015-07-04 05:06:57,184 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.5 sec
2015-07-04 05:06:58,191 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 1.5 sec
MapReduce Total cumulative CPU time: 1 seconds 500 msec
Ended Job = job_201503250213_4613
MapReduce Jobs Launched:
Job 0: Map: 1 Cumulative CPU: 1.5 sec HDFS Read: 667 HDFS Write: 105 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 500 msec
OK
version  overrideable  scenario  repairtype  rank  notificationtype
2.3      false         1         ISR         1     Z5
2.3      false         1         SER         2     NO
2.3      false         1         ACC         3     Z5
2.3      false         1         SWP         4     Z5
2.3      false         4         RMS         5     Z8
Time taken: 15.831 seconds, Fetched: 5 row(s)
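As a possible variation on the query above, the four get_json_object calls on each strategy can be replaced by a single json_tuple lateral view, which parses each JSON string once. This is only a sketch against the same helper_json table; the aliases base, ex and jt are illustrative names, and it has not been run against the cluster shown here.

```sql
-- Sketch only: base, ex and jt are illustrative aliases, not part of the original query.
SELECT base.version,
       base.overrideable,
       jt.scenario,
       jt.repairType,
       jt.rank,
       jt.notificationType
FROM (
  -- same preparation as before: strip [ ], turn "},{" into "}|{", then split on "|"
  SELECT get_json_object(line, '$.version') AS version,
         get_json_object(line, '$.channelOutcome.MG.overrideable') AS overrideable,
         split(
           regexp_replace(
             regexp_replace(get_json_object(line, '$.channelOutcome.MG.repairStrategies'), '\\[|\\]', ''),
             '\\}\\,\\{', '\\}\\|\\{'),
           '\\|') AS rs_array
  FROM helper_json
) base
LATERAL VIEW explode(base.rs_array) ex AS strategy
LATERAL VIEW json_tuple(ex.strategy, 'scenario', 'repairType', 'rank', 'notificationType') jt
  AS scenario, repairType, rank, notificationType;
```

Note that json_tuple returns every field as a string, so rank would need an explicit cast if a numeric type is required.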
This can be done more simply by adding the external https://github.com/rcongiu/Hive-JSON-Serde. See the examples at http://thornydev.blogspot.in/2013/07/querying-json-records-via-hive.html, where using the external jar makes this easy to solve.
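A rough sketch of what the SerDe-based approach might look like for the record shown earlier. The jar path, the table name helper_json_serde and the LOCATION directory are placeholders (assumptions, not taken from the original post), and the column types are inferred from the single sample record, so adjust them to the real data.

```sql
-- Assumption: placeholder jar path; point it at the Hive-JSON-Serde jar you actually built or downloaded.
ADD JAR /path/to/json-serde-with-dependencies.jar;

-- Assumption: hypothetical table name and HDFS location; column types inferred from the sample record.
CREATE EXTERNAL TABLE helper_json_serde (
  channelOutcome STRUCT<MG:STRUCT<repairStrategies:ARRAY<STRUCT<scenario:STRING,repairType:STRING,rank:INT,notificationType:STRING>>,overrideable:BOOLEAN>>,
  keyValues ARRAY<STRING>,
  version DOUBLE
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION '/path/to/json/records';

-- With typed columns, the array can be exploded directly, with no regexp_replace or split.
SELECT t.version,
       t.channelOutcome.MG.overrideable,
       rs.scenario,
       rs.repairType,
       rs.rank,
       rs.notificationType
FROM helper_json_serde t
LATERAL VIEW explode(t.channelOutcome.MG.repairStrategies) ex AS rs;
```

The trade-off is an extra jar to deploy and a schema to maintain, in exchange for a much shorter query.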