2014-03-06

Pig JsonLoader issue: not parsing custom JSON correctly

I am new to Pig, and I am trying to parse the following JSON structure:

{"id1":197,"id2":[ 
    {"id3":"109.11.11.0","id4":"","id5":1391233948301}, 
    {"id3":"10.10.15.81","id4":"","id5":1313393100648}, 
    ... 
]} 

The file above is named jsonfile.txt. This is how I load it:

alias = load 'jsonfile.txt' using JsonLoader('id1:int,id2:[id3:chararray,id4:chararray,id5:chararray]'); 

This is the error I get:

ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: mismatched input 'id3' expecting RIGHT_BRACKET

Any idea what I might be doing wrong?

Try validating the whole JSON [here](http://jsonlint.com/). Maybe it is just a trailing comma. – kirilloid

I just checked, and the JSON is well-formed. – user1386101

Answer

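The schema you pass to JsonLoader is not formatted correctly.

The documentation gives the following formats for complex data types: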

Tuple: enclosed by (), items separated by ","
    Non-empty tuple: (item1,item2,item3)
    Empty tuple is valid: ()
Bag: enclosed by {}, tuples separated by ","
    Non-empty bag: {(tuple1),(tuple2),(tuple3)}
    Empty bag is valid: {} 
Map: enclosed by [], items separated by ",", key and value separated by "#" 
    Non-empty map: [key1#value1,key2#value2] 
    Empty map is valid: [] 

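Source: http://pig.apache.org/docs/r0.10.0/func.html#jsonloadstore

In other words, [] does not denote an array in a Pig schema. It denotes an associative table (a map), in which "#" separates each key from its value. That is why the parser rejects 'id3' inside the brackets. Try using a tuple (parentheses) instead: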

'id1:int,id2:(id3:chararray,id4:chararray,id5:chararray)' 

OR

'id1:int,id2:{(id3:chararray,id4:chararray,id5:chararray)}' 

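I could not test this, and I have never used Pig, but according to the documentation it should work fine.

(Based on the following example from the documentation:)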

a = load 'a.json' using JsonLoader('a0:int,a1:{(a10:int,a11:chararray)},a2:(a20:double,a21:bytearray),a3:[chararray]');
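
To make the bracket mapping above concrete, here is a minimal Python sketch that derives a JsonLoader-style schema string from one sample JSON record. Everything in it is an assumption made for illustration: the helper names (`pig_type`, `fields`, `jsonloader_schema`) are hypothetical, nested JSON objects are rendered as tuples, arrays of objects are rendered as bags of tuples, and integers that do not fit in 32 bits are rendered as `long`. It is not part of Pig and has not been checked against a Pig installation.

```python
import json

INT32_MAX = 2**31 - 1  # assumption: larger integers are declared as Pig long


def pig_type(name, value):
    """Return a 'name:type' schema fragment for one JSON value."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return f"{name}:boolean"
    if isinstance(value, int):
        return f"{name}:int" if abs(value) <= INT32_MAX else f"{name}:long"
    if isinstance(value, float):
        return f"{name}:double"
    if isinstance(value, str):
        return f"{name}:chararray"
    if isinstance(value, dict):
        # JSON object -> tuple, written with parentheses
        return f"{name}:({fields(value)})"
    if isinstance(value, list) and value and isinstance(value[0], dict):
        # JSON array of objects -> bag of tuples, written as {( ... )}
        return f"{name}:{{({fields(value[0])})}}"
    raise ValueError(f"unsupported value for field {name!r}")


def fields(obj):
    """Join the schema fragments of all keys of a JSON object."""
    return ",".join(pig_type(key, val) for key, val in obj.items())


def jsonloader_schema(sample_line):
    """Build a schema string from one JSON record given as text."""
    return fields(json.loads(sample_line))


sample = '{"id1":197,"id2":[{"id3":"109.11.11.0","id4":"","id5":1391233948301}]}'
print(jsonloader_schema(sample))
```

For the record from the question this yields the bag form, with `id5` declared as `long` rather than `chararray`. Since `id2` is a JSON array of objects, the bag variant looks like the closer match of the two schemas suggested above, though that too is untested here.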