2014-02-28 16 views

Answer

0

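Build a MapReduce job with an XmlInputFormat (such as this one: https://github.com/apache/mahout/blob/ad84344e4055b1e6adff5779339a33fa29e1265d/examples/src/main/java/org/apache/mahout/classifier/bayes/XmlInputFormat.java) for input and an AvroKeyValueOutputFormat for output, and put your specific business logic inside the mapper (or the mapper and reducer):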

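// Read each XML element (the text between a start tag and its end tag) as one input record.
job.setInputFormatClass(XmlInputFormat.class);
// Write the job's key/value pairs as Avro records.
job.setOutputFormatClass(AvroKeyValueOutputFormat.class);
// Declare the Avro schemas of the output key (string) and value (int); they must match what the job emits.
AvroJob.setOutputKeySchema(job, Schema.create(Schema.Type.STRING));
AvroJob.setOutputValueSchema(job, Schema.create(Schema.Type.INT));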
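The "business logic" part is left open above, so here is a rough sketch of what it might look like, written in plain Java without Hadoop so it can be tried in isolation. It assumes records shaped like `<item><name>…</name><count>…</count></item>`. The element names `item`, `name` and `count`, the class `XmlRecordSketch` and its methods are all made up for illustration and are not part of Mahout or Avro. `extractRecords` only imitates the idea of an XML input format (one record per start/end tag pair), and `toKeyValue` produces the (string, int) pair that would match the two schemas configured above.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, for illustration only.
final class XmlRecordSketch {

    // Collects every substring that begins with startTag and ends with endTag.
    // This mimics, in memory, how an XML input format hands one record at a time to the mapper.
    static List<String> extractRecords(String xml, String startTag, String endTag) {
        List<String> records = new ArrayList<>();
        int from = 0;
        while (true) {
            int start = xml.indexOf(startTag, from);
            if (start < 0) {
                break; // no further records
            }
            int end = xml.indexOf(endTag, start + startTag.length());
            if (end < 0) {
                break; // unterminated record: skip it
            }
            end += endTag.length();
            records.add(xml.substring(start, end));
            from = end;
        }
        return records;
    }

    // Assumed business logic: turn one record into a (name, count) pair,
    // i.e. a string key and an int value.
    static Map.Entry<String, Integer> toKeyValue(String record) {
        String name = textOf(record, "name");
        int count = Integer.parseInt(textOf(record, "count").trim());
        return new AbstractMap.SimpleEntry<>(name, count);
    }

    // Returns the text between <element> and </element>; a naive lookup with no
    // support for attributes or nesting, which a real job would delegate to an XML parser.
    static String textOf(String record, String element) {
        String open = "<" + element + ">";
        String close = "</" + element + ">";
        int start = record.indexOf(open);
        if (start < 0) {
            throw new IllegalArgumentException("missing " + open);
        }
        int end = record.indexOf(close, start + open.length());
        if (end < 0) {
            throw new IllegalArgumentException("missing " + close);
        }
        return record.substring(start + open.length(), end);
    }
}
```

In an actual job, the body of `toKeyValue` would sit in the mapper's `map` method, which would receive the record text from the input format and write the resulting key and value to the job context instead of returning them.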