2014-03-04

Creating an Avro schema for simple JSON

I am trying to build an Avro schema for the following JSON:

{ 
    "id":1234, 
    "my_name_field": "my_name", 
    "extra_data": { 
     "my_long_value": 1234567890, 
     "my_message_string": "Hello World!", 
     "my_int_value": 777, 
     "some_new_field": 1 
    } 
} 

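The "id" and "my_name_field" fields are known in advance, but the contents of the "extra_data" field change dynamically and are not known ahead of time.

The Avro schema I had in mind is: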

{ 
    "name":"my_record", 
    "type":"record", 
    "fields":[ 
     {"name":"id", "type":"int", "default":0}, 
     {"name":"my_name_field", "type":"string", "default":"NoName"}, 
     { "name":"extra_data", "type":{"type":"map", "values":["null","int","long","string"]}  }   
    ] 
} 

My first idea was to make "extra_data" a record holding a map, but this does not work:

{ "name":"extra_data", "type":{"type":"map", "values":["null","int","long","string"]} } 

I get:

AvroTypeException: Expected start-union. Got VALUE_NUMBER_INT 
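
For context, this message is consistent with how the Avro specification defines the JSON encoding of unions: a null value is written as JSON null, while any other union value is written as a single-key object whose key is the branch type name, for example {"int": 777}. A bare 777 inside a map whose values are a union therefore appears to the JSON decoder as a number where the start of a union object was expected. The following is a minimal sketch in plain Java (no Avro dependency) of what such wrapped input would look like. The class and method names are hypothetical, and it assumes the branch is chosen from the Java type of each value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Avro: renders map values in the
// union form used by Avro's JSON encoding.
class UnionJsonSketch {

    // Escapes backslashes and double quotes only (enough for this sketch).
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Wraps one value as a union branch: null stays null,
    // everything else becomes {"<type name>": value}.
    static String wrap(Object v) {
        if (v == null) return "null";
        if (v instanceof Integer) return "{\"int\":" + v + "}";
        if (v instanceof Long) return "{\"long\":" + v + "}";
        // Assumption: any other value is treated as a string branch.
        return "{\"string\":\"" + escape(v.toString()) + "\"}";
    }

    // Renders a whole map with every value wrapped as a union branch.
    static String wrapMap(Map<String, Object> m) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : m.entrySet()) {
            if (!first) sb.append(",");
            sb.append('"').append(escape(e.getKey())).append("\":").append(wrap(e.getValue()));
            first = false;
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        Map<String, Object> extra = new LinkedHashMap<>();
        extra.put("my_message_string", "Hello World!");
        extra.put("my_int_value", 777);
        System.out.println(wrapMap(extra));
    }
}
```

Whether rewriting the incoming JSON this way is acceptable depends on who produces the events; it is shown only to illustrate the shape the decoder expects.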

Apache gives some good examples at https://cwiki.apache.org/confluence/display/Hive/AvroSerDe, but none of them seem to do the job.

This is the unit test I am running to check:

public class AvroTest {

@Test 
public void readRecord() throws IOException { 

    String event="{\"id\":1234,\"my_name_field\":\"my_name\",\"extra_data\":{\"my_long_value\":1234567890,\"my_message_string\":\"Hello World!\",\"my_int_value\":777,\"some_new_field\":1}}"; 

    SchemaRegistry<Schema> registry = new com.linkedin.camus.schema.MySchemaRegistry(); 
    DecoderFactory decoderFactory = DecoderFactory.get(); 

    ObjectMapper mapper = new ObjectMapper(); 
    GenericDatumReader<GenericData.Record> reader = new GenericDatumReader<GenericData.Record>(); 
    Schema schema = registry.getLatestSchemaByTopic("record_topic").getSchema(); 
    reader.setSchema(schema); 

    HashMap hashMap = mapper.readValue(event, HashMap.class); 
    long now = Long.valueOf(hashMap.get("now").toString())*1000; 
    GenericData.Record read = reader.read(null, decoderFactory.jsonDecoder(schema, event)); 
} 
}

Any help solving this would be appreciated. Thanks.

Answer


If the list of extra data fields really is unknown, using multiple optional value fields may help, like this:

{ 
    "name":"my_record", 
    "type":"record", 
    "fields":[ 
     {"name":"id", "type":"int", "default":0}, 
     {"name":"my_name_field", "type":"string", "default":"NoName"}, 
     {"name":"extra_data", "type": {"type": "array", "items": 
      {"name": "extra_data_entry", "type":"record", "fields": [ 
       {"name":"extra_data_field_name", "type": "string"}, 
       {"name":"extra_data_field_type", "type": "string"}, 
       {"name":"extra_data_field_value_string", "type": ["null", "string"]}, 
       {"name":"extra_data_field_value_int", "type": ["null", "int"]}, 
       {"name":"extra_data_field_value_long", "type": ["null", "long"]} 
      ]} 
     }} 
    ] 
} 

You can then use extra_data_field_type to decide which of the extra_data_field_value_* fields holds the value.
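
The selection step can be sketched in plain Java, without any Avro dependency. The class `ExtraDataEntries` and its methods are hypothetical names introduced for illustration. The sketch assumes each dynamic value is an Integer, a Long, or something that can be stored as its string form, and it uses ordinary maps to stand in for the extra_data_entry records.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Avro: reshapes a free-form extra_data
// map into the list-of-entries layout used by the schema above.
class ExtraDataEntries {

    // Builds one entry per key; only the slot matching the type tag is filled.
    static List<Map<String, Object>> toEntries(Map<String, Object> extraData) {
        List<Map<String, Object>> entries = new ArrayList<>();
        for (Map.Entry<String, Object> e : extraData.entrySet()) {
            Object v = e.getValue();
            String type;
            if (v instanceof Integer) {
                type = "int";
            } else if (v instanceof Long) {
                type = "long";
            } else {
                // Assumption: anything else is stored as its string form.
                type = "string";
            }
            Map<String, Object> entry = new LinkedHashMap<>();
            entry.put("extra_data_field_name", e.getKey());
            entry.put("extra_data_field_type", type);
            entry.put("extra_data_field_value_string", null);
            entry.put("extra_data_field_value_int", null);
            entry.put("extra_data_field_value_long", null);
            entry.put("extra_data_field_value_" + type,
                    type.equals("string") ? String.valueOf(v) : v);
            entries.add(entry);
        }
        return entries;
    }

    // Reads a value back by dispatching on extra_data_field_type.
    static Object valueOf(Map<String, Object> entry) {
        String type = (String) entry.get("extra_data_field_type");
        return entry.get("extra_data_field_value_" + type);
    }

    public static void main(String[] args) {
        Map<String, Object> extra = new LinkedHashMap<>();
        extra.put("my_long_value", 1234567890L);
        extra.put("my_message_string", "Hello World!");
        extra.put("my_int_value", 777);
        for (Map<String, Object> entry : toEntries(extra)) {
            System.out.println(entry.get("extra_data_field_name") + " = " + valueOf(entry));
        }
    }
}
```

The trade-off of this layout is that every entry carries two unused null slots, in exchange for a fixed schema that does not need to change when new keys appear in extra_data.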