2016-12-14

How can I map CSV data to a nested Avro schema? Suppose I have a schema like the one below:

{
    "name": "phoneNumber",
    "type": {
        "type": "record",
        "name": "internalNumber",
        "namespace": "com.wiki",
        "fields": [{
            "name": "areacode",
            "type": "string"
        }, {
            "name": "phone",
            "type": ["null", "string"],
            "doc": "Actual full number",
            "default": null
        }]
    }
}

and I have the data spread across multiple columns in a CSV, like this:

areaCode phoneNumber 
+1  1234512345 

From a Pig script, how can I produce an Avro file like this:

"phoneNumber" : {
    "areacode" : "+1",
    "phone" : "1234512345"
}

given that the record is nested?

Answer

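REGISTER piggybank.jar; -- adjust to wherever the piggybank jar lives
A = LOAD 'path' USING org.apache.pig.piggybank.storage.CSVLoader() AS (areaCode:chararray, phoneNumber:chararray);
-- Wrap both columns in one tuple; AvroStorage writes a tuple as a nested record.
B = FOREACH A GENERATE TOTUPLE(areaCode, phoneNumber) AS phoneNumber:tuple(areacode:chararray, phone:chararray);
STORE B INTO 'output_path' USING org.apache.pig.piggybank.storage.avro.AvroStorage();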

You need CSVLoader and AvroStorage from piggybank.
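
If the record name, namespace, and nullable `phone` field from the question need to appear in the output, one option is to hand the schema to AvroStorage explicitly. The following is only a sketch under several assumptions: the jar location and both paths are placeholders, the outer record name `PhoneEntry` is hypothetical (the question shows only the field definition, not the enclosing record), and piggybank's AvroStorage typically also needs the Avro and JSON jars registered. The schema compatibility check may require adjustments, for example if `areacode` can be null.

```pig
-- Sketch only: jar location and paths are placeholders.
REGISTER piggybank.jar;

A = LOAD 'input_path' USING org.apache.pig.piggybank.storage.CSVLoader()
    AS (areaCode:chararray, phoneNumber:chararray);

-- Wrap the two flat columns in one tuple and name its fields after the schema.
B = FOREACH A GENERATE
    TOTUPLE(areaCode, phoneNumber) AS phoneNumber:tuple(areacode:chararray, phone:chararray);

-- Print the relation's schema so the nesting can be checked before storing.
DESCRIBE B;

-- "PhoneEntry" is a hypothetical outer record wrapping the field from the question.
STORE B INTO 'output_path' USING org.apache.pig.piggybank.storage.avro.AvroStorage(
    'schema',
    '{"type":"record","name":"PhoneEntry","namespace":"com.wiki","fields":[{"name":"phoneNumber","type":{"type":"record","name":"internalNumber","namespace":"com.wiki","fields":[{"name":"areacode","type":"string"},{"name":"phone","type":["null","string"],"doc":"Actual full number","default":null}]}}]}');
```

Running `DESCRIBE B` first is a cheap way to confirm that `phoneNumber` is a tuple with the fields `areacode` and `phone` before any Avro file is written.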