Pig XMLLoader: cannot parse XML (convert to CSV)

I have the following XML data:
<CompactData><my:DataSet><my:Series VAL="A" AMOUNT_TYPE="FI" IDENTIFIER="1"><my:Obs AMT="24.25" UNIT_MEASURE="KG"></my:Obs></my:Series><my:Series VAL="B" AMOUNT_TYPE="GI" IDENTIFIER="2"><my:Obs AMT="21.22" UNIT_MEASURE="KG"></my:Obs></my:Series></my:DataSet></CompactData>
I want to convert it to CSV format in Pig using the following commands:
A = LOAD '/testing/mydata.xml' using org.apache.pig.piggybank.storage.XMLLoader('CompactData') as (x:chararray);
B = FOREACH A GENERATE FLATTEN(REGEX_EXTRACT_ALL(x,'<my:Series VAL="([^"]+)" AMOUNT_TYPE="([^"]+)" IDENTIFIER="([^"]+)"><my:Obs AMT="([^"]+)" UNIT_MEASURE="([^"]+)"></my:Obs></my:Series>')) AS (val:chararray,amount_type:chararray,identifier:chararray,amt:chararray,unit_measure:chararray);
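Pasting the regular expression
<my:Series VAL="([^"]+)" AMOUNT_TYPE="([^"]+)" IDENTIFIER="([^"]+)"><my:Obs AMT="([^"]+)" UNIT_MEASURE="([^"]+)"><\/my:Obs><\/my:Series>
into Regexr gives two perfect matches, but Pig just refuses to work with it. It always gives me an empty result, whereas I expect the following: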
A,FI,1,24.25,KG
B,GI,2,21.22,KG
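
For reference, the Regexr check can be reproduced outside Pig. The following is a minimal sketch in Python (not Pig), assuming the sample document is held in a single string. The names `XML`, `SERIES_RE`, and `series_to_csv` are made up for illustration. It only shows that the pattern itself finds both series when searched for anywhere in the text; it says nothing about how XMLLoader or REGEX_EXTRACT_ALL behave.

```python
import re

# Sample document from the question, kept on a single logical line.
XML = (
    '<CompactData><my:DataSet>'
    '<my:Series VAL="A" AMOUNT_TYPE="FI" IDENTIFIER="1">'
    '<my:Obs AMT="24.25" UNIT_MEASURE="KG"></my:Obs></my:Series>'
    '<my:Series VAL="B" AMOUNT_TYPE="GI" IDENTIFIER="2">'
    '<my:Obs AMT="21.22" UNIT_MEASURE="KG"></my:Obs></my:Series>'
    '</my:DataSet></CompactData>'
)

# Same pattern as in the Pig script, with one capture group per CSV column.
SERIES_RE = re.compile(
    r'<my:Series VAL="([^"]+)" AMOUNT_TYPE="([^"]+)" IDENTIFIER="([^"]+)">'
    r'<my:Obs AMT="([^"]+)" UNIT_MEASURE="([^"]+)"></my:Obs></my:Series>'
)


def series_to_csv(xml):
    """Return one comma-joined row per <my:Series> element found in xml."""
    # findall scans the whole string and yields one tuple of groups per match.
    return [",".join(groups) for groups in SERIES_RE.findall(xml)]


rows = series_to_csv(XML)
for row in rows:
    print(row)
```

Run against the sample data, this prints the two rows listed above, so the pattern itself does not appear to be what is going wrong.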
Update 1: This most likely seems to be related to the issue mentioned here: Pig xmlloader error when loading tag with colon
This is actually related to the same issue mentioned here: http://stackoverflow.com/questions/30939813/pig-xmlloader-error-when-loading-tag-with-colon – Syed