
Serialization error when consuming JSON objects from Kafka

I am trying to produce JSON objects to Kafka and consume them manually. I am using the JSONPOJO Serdes from org.apache.kafka.streams.examples.pageview.

My producer code:

package JsonProducer;

import java.io.IOException;
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.json.JSONObject; // assumption: the JSON library in use (imports were elided in the question)

public class jsnPdc {

    public static void main(String[] args) throws IOException {

        byte[] arr = "XXXX  THIS IS TEST DATA \n XYZ".getBytes();
        JSONObject jsn = new JSONObject();
        // Note: the values stored here are byte arrays, not strings
        jsn.put("Header_Title", Arrays.copyOfRange(arr, 0, 4));
        jsn.put("Data_Part", Arrays.copyOfRange(arr, 4, arr.length));

        Properties props = new Properties();
        props.put("bootstrap.servers", "xxxxxxxxxxxxxxxxxxxxx:xxxx");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.streams.examples.pageview.JsonPOJOSerializer");

        KafkaProducer<String, JSONObject> pdc = new KafkaProducer<>(props);
        pdc.send(new ProducerRecord<String, JSONObject>("testoutput", jsn));

        System.in.read();
    }
}

And the consumer code is:

package testConsumer;

import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.json.JSONObject; // assumption: the JSON library in use; the import for
                            // ParseException (not shown in the question) depends on the parser used

public class consumer_0 {
    public static void main(String[] argv) throws ParseException {

        //Configuration
        Properties props = new Properties();
        props.put("bootstrap.servers", "xxxxxxxxxxxxxxxxxxx:xxxx");
        props.put("group.id", "test");
        props.put("enable.auto.commit", "false");
        props.put("auto.commit.interval.ms", "1000");
        props.put("session.timeout.ms", "30000");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.streams.examples.pageview.JsonPOJODeserializer");

        //Create consumer object
        KafkaConsumer<String, JSONObject> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Arrays.asList("testoutput"));

        //Keep polling records
        System.out.println("Polling new record...\n");
        while (true) {
            ConsumerRecords<String, JSONObject> records = consumer.poll(100);

            //Print each record
            for (ConsumerRecord<String, JSONObject> record : records) {
                JSONObject json = record.value();

                //Some print code, print(json) ...
            }
        }
    }
}

And I get this error:

Exception in thread "main" org.apache.kafka.common.errors.SerializationException: Error deserializing key/value for partition testoutput-0 at offset 20491 
Caused by: org.apache.kafka.common.errors.SerializationException: java.lang.IllegalArgumentException: Unrecognized Type: [null] 
Caused by: java.lang.IllegalArgumentException: Unrecognized Type: [null] 
    at com.fasterxml.jackson.databind.type.TypeFactory._fromAny(TypeFactory.java:1170) 
    at com.fasterxml.jackson.databind.type.TypeFactory.constructType(TypeFactory.java:618) 
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2929) 
    at org.apache.kafka.streams.examples.pageview.JsonPOJODeserializer.deserialize(JsonPOJODeserializer.java:49) 
    at org.apache.kafka.clients.consumer.internals.Fetcher.parseRecord(Fetcher.java:882) 
    at org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:788) 
    at org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:480) 
    at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1061) 
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995) 
    at testConsumer.consumer_0.main(consumer_0.java:43) 

I need the value fields of the JSON to be byte arrays. Any idea why this is happening?

Answer


You have misunderstood who is responsible for serializing the value. You are telling Kafka to serialize the value you give it using org.apache.kafka.streams.examples.pageview.JsonPOJOSerializer, which expects a plain Java object like

class Data { 
    private String headerTitle; 
    private String dataPart; 
    //... constructors, getters, setters 
} 

as the value. But you have actually passed a JSONObject to the ProducerRecord; in other words, you have already serialized the data yourself before giving it to Kafka, which then tries to serialize it again.

You can either serialize your jsn value yourself and use org.apache.kafka.common.serialization.StringSerializer as the value.serializer, or you can stick with org.apache.kafka.streams.examples.pageview.JsonPOJOSerializer, define a class like Data above, and pass an instance of that class to the ProducerRecord.
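For the second route, a minimal sketch of the producer side might look like this (assumptions: Data has a (headerTitle, dataPart) constructor plus getters so that Jackson can serialize it; broker address and topic are taken from the question):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class PojoProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "xxxxxxxxxxxxxxxxxxxxx:xxxx");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.streams.examples.pageview.JsonPOJOSerializer");

        // JsonPOJOSerializer hands the POJO to Jackson, which turns its getters into JSON fields
        KafkaProducer<String, Data> producer = new KafkaProducer<>(props);
        producer.send(new ProducerRecord<>("testoutput", new Data("XXXX", " THIS IS TEST DATA \n XYZ")));
        producer.close();
    }
}

On the consumer side, note that (if I read the pageview example correctly) JsonPOJODeserializer looks up its target class from a "JsonPOJOClass" entry in the config, so something like props.put("JsonPOJOClass", Data.class) would also be needed; the Unrecognized Type: [null] in the stack trace suggests exactly that entry was missing, leaving Jackson to construct a null type.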


Thanks! Now I finally understand what was going on. I have solved it by serializing my whole jsn object to byte[] myself and deserializing it back from byte[] on my consumer side, both using "org.apache.kafka.common.serialization.ByteArraySerde". Do you think this will hurt the performance of my streams/consumers compared to using a POJO with the JsonPOJOSerde? – zzlyn
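A minimal sketch of that byte[] workaround, assuming org.json's JSONObject and a plain producer/consumer (where the Serde splits into ByteArraySerializer and ByteArrayDeserializer; encoding via toString() is one way to do the byte[] conversion):

import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.json.JSONObject;

public class ByteArrayRouteSketch {
    public static void main(String[] args) throws Exception {
        // Producer: encode the JSONObject to UTF-8 bytes before handing it to Kafka
        Properties pProps = new Properties();
        pProps.put("bootstrap.servers", "xxxxxxxxxxxxxxxxxxx:xxxx");
        pProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        pProps.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");

        JSONObject jsn = new JSONObject().put("Header_Title", "XXXX");
        try (KafkaProducer<String, byte[]> producer = new KafkaProducer<>(pProps)) {
            producer.send(new ProducerRecord<>("testoutput",
                    jsn.toString().getBytes(StandardCharsets.UTF_8)));
        }

        // Consumer: decode the bytes back into a JSONObject
        Properties cProps = new Properties();
        cProps.put("bootstrap.servers", "xxxxxxxxxxxxxxxxxxx:xxxx");
        cProps.put("group.id", "test");
        cProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        cProps.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        try (KafkaConsumer<String, byte[]> consumer = new KafkaConsumer<>(cProps)) {
            consumer.subscribe(Arrays.asList("testoutput"));
            for (ConsumerRecord<String, byte[]> record : consumer.poll(100)) {
                System.out.println(new JSONObject(new String(record.value(), StandardCharsets.UTF_8)));
            }
        }
    }
}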


If you are dealing with a complex data tree, a POJO makes for much more understandable code. I would guess that the jsn route is actually more efficient (since Jackson's ObjectMapper works via reflection), but if performance is an important requirement you should profile which one is faster. –
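A rough way to compare the two, as a sketch (a real measurement should use a proper benchmark harness such as JMH; Data is assumed to be the POJO from the answer with a two-argument constructor and getters):

import java.nio.charset.StandardCharsets;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.json.JSONObject;

public class SerdeTimingSketch {
    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        Data pojo = new Data("XXXX", "XYZ");
        JSONObject jsn = new JSONObject().put("Header_Title", "XXXX").put("Data_Part", "XYZ");

        long t0 = System.nanoTime();
        for (int i = 0; i < 100_000; i++) mapper.writeValueAsBytes(pojo);                  // reflection-based
        long t1 = System.nanoTime();
        for (int i = 0; i < 100_000; i++) jsn.toString().getBytes(StandardCharsets.UTF_8); // string building
        long t2 = System.nanoTime();

        System.out.printf("POJO/Jackson: %d ms, JSONObject: %d ms%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
    }
}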