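I am using Flume to write binary objects to HDFS, and I want to read the resulting binary Avro files in Pig. My Flume agent's sink is set up like this: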
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = /user/%y-%m-%d/%H%M/%S
a1.sinks.k1.hdfs.filePrefix = events-
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = minute
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.serializer = avro_event
a1.sinks.k1.hdfs.serializer.syncIntervalBytes = 4096000
a1.sinks.k1.hdfs.serializer.compressionCodec = snappy
a1.sinks.k1.hdfs.serializer.appendNewline = false
a1.sinks.k1.hdfs.fileSuffix = .avro
a1.sinks.k1.hdfs.writeFormat = TEXT
Now I want to read the HDFS file (something.avro) like this:
data = LOAD 'something.avro'
USING org.apache.pig.piggybank.storage.avro.AvroStorage();
dump data;
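I keep getting the following exception. Any idea why I am getting it, or is there another way to read binary Avro objects in a Pig script without providing the Avro schema?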
Caused by: java.io.IOException: Not a data file.
at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:105)
at org.apache.avro.file.DataFileStream.<init>(DataFileStream.java:84)
at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.getSchema(AvroStorageUtils.java:718)
at org.apache.pig.piggybank.storage.avro.AvroStorage.getSchema(AvroStorage.java:349)
at org.apache.pig.piggybank.storage.avro.AvroStorage.getAvroSchema(AvroStorage.java:277)
at org.apache.pig.piggybank.storage.avro.AvroStorage.getAvroSchema(AvroStorage.java:248)
at org.apache.pig.piggybank.storage.avro.AvroStorage.setInputAvroSchema(AvroStorage.java:226)
at org.apache.pig.piggybank.storage.avro.AvroStorage.getSchema(AvroStorage.java:434)
at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:175)
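
For what it's worth, the exception is thrown from `DataFileStream.initialize`, which is where Avro checks that the stream begins with the container-file magic bytes (`Obj` followed by the byte 1). One way to narrow this down is to copy a single output file out of HDFS (for example with `hdfs dfs -get`) and look at its first four bytes. Below is a minimal Python sketch of that check; the function name and the local path in the usage note are hypothetical, and only the magic value comes from the Avro specification.

```python
# Every Avro object container file starts with these four bytes:
# ASCII 'O', 'b', 'j', followed by the byte 0x01.
AVRO_MAGIC = b"Obj\x01"


def looks_like_avro_container(path):
    """Return True if the local file at `path` starts with the Avro magic bytes.

    This only inspects the header; it does not validate the schema or the
    data blocks that follow it.
    """
    with open(path, "rb") as f:
        return f.read(len(AVRO_MAGIC)) == AVRO_MAGIC
```

Calling `looks_like_avro_container("something.avro")` on a local copy should return `False` if the sink wrote the raw event bodies instead of an Avro container file, which would be consistent with the "Not a data file" message.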