2013-05-02

Answers

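Here is some code that is untested but should do the trick (the file names are obviously made up; sequence files do not usually have extensions):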

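// Imports needed by the snippet below
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;
import org.apache.hadoop.io.compress.DefaultCodec;
import org.apache.hadoop.util.ReflectionUtils;

Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);

Path inputPath = new Path("part-r-00000.snappy");
Path outputPath = new Path("part-r-00000.deflate");
FSDataOutputStream dos = fs.create(outputPath);

// The reader picks up the input codec from the sequence file header
SequenceFile.Reader reader = new SequenceFile.Reader(fs, inputPath, conf);
// Reusable key/value holders of the types recorded in the header
Writable key = (Writable) ReflectionUtils.newInstance(
        reader.getKeyClass(), conf);
Writable value = (Writable) ReflectionUtils.newInstance(
        reader.getValueClass(), conf);

// Look up the deflate codec (DefaultCodec) for the output file
CompressionCodecFactory ccf = new CompressionCodecFactory(conf);
CompressionCodec codec = ccf.getCodecByClassName(DefaultCodec.class.getName());
// Keep the input's compression type (NONE, RECORD or BLOCK), change only the codec
SequenceFile.Writer writer = SequenceFile.createWriter(conf, dos,
        key.getClass(), value.getClass(), reader.getCompressionType(),
        codec);

// Copy every record; the reader decompresses and the writer recompresses
while (reader.next(key, value)) {
    writer.append(key, value);
}

reader.close();
// Flush buffered records; a writer built on a supplied stream does not close it
writer.close();
dos.close();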

You should also obtain the configuration via the ToolRunner / Tool pattern. Here is a similar question that outlines the principle for you:
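
For reference, a minimal sketch of the Tool / ToolRunner pattern, assuming the Hadoop client libraries are on the classpath. The class name RecompressTool and its two-argument usage are hypothetical and not part of the original answer. The idea is that getConf() returns a Configuration to which ToolRunner has already applied generic options such as -D key=value, so the recompression code can use it instead of calling new Configuration() itself.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical class name, used only for illustration.
public class RecompressTool extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // ToolRunner has already parsed the generic options (-D, -conf, -fs, ...)
        // into this Configuration and removed them from args.
        Configuration conf = getConf();

        // Assumed usage for this sketch: exactly two positional arguments.
        if (args.length != 2) {
            System.err.println("Usage: RecompressTool <input> <output>");
            return 2;
        }

        // Placeholder: the sequence file copy loop would go here, using conf
        // rather than a freshly constructed Configuration.
        return 0;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner wires the Configuration into the tool, then calls run()
        // with the remaining (non-generic) arguments.
        System.exit(ToolRunner.run(new Configuration(), new RecompressTool(), args));
    }
}
```

With this in place, settings can be overridden from the command line without recompiling, for example by passing -D io.file.buffer.size=65536 ahead of the input and output paths.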