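How do I split a string on the pipe character (|) when reading a text file in MapReduce?
I want to write a MapReduce program that finds the number of occurrences of each TV sold.
Sample input -
Samsung|Optima|14|Madhya Pradesh|132401|14200
Onida|Lucid|18|Uttar Pradesh|232401|16200
Akai|Decent|16|Kerala|922401|12200
Lava|Attention|20|Assam|454601|24200
Zen|Super|14|Maharashtra|619082|9200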
Here is what I have written so far.
Mapper -
public class TotalUnitMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    Text tvname;
    // IntWritable unit;

    public void setup(Context context) {
        tvname = new Text();
        // unit = new IntWritable();
    }

    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] lineArray2 = value.toString().split("|");
        if (!lineArray2[0].contains("NA") || (!lineArray2[1].contains("NA"))) {
            tvname.set((lineArray2[0]));
            IntWritable unit = new IntWritable(1);
            context.write(tvname, unit);
        }
    }
}
Reducer -
public class TotalUnitReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text tvname, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }
        context.write(tvname, new IntWritable(sum));
    }
}
Driver -
public class TotalUnit {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "Assignment 3.3-2");
        job.setJarByClass(TotalUnit.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setMapperClass(TotalUnitMapper.class);
        job.setReducerClass(TotalUnitReducer.class);
        job.setNumReduceTasks(2);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.waitForCompletion(true);
    }
}
But this is the output I am getting -
A 1
O 4
S 7
L 3
N 1
Z 2
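Only the first letter of each TV name is printed, and I don't know why. Is something wrong with the split? Please help, as I am a beginner with Hadoop. Thanks in advance.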
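A likely cause, sketched outside Hadoop so it can be run on its own: String.split takes a regular expression, and an unescaped "|" is an alternation between two empty patterns, so it matches between every character. On Java 8 or later that makes element 0 the first letter of the line, which fits the output above. The class name PipeSplitDemo and the helpers splitFields and countBrands are hypothetical names used only for this illustration; they are not part of the posted job.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical stand-alone demo; no Hadoop classes are needed to see the effect.
class PipeSplitDemo {
    // Pattern.quote turns "|" into a literal; the regex "\\|" would work the same way.
    private static final Pattern PIPE = Pattern.compile(Pattern.quote("|"));

    /** Splits one input record on literal '|' characters. */
    public static String[] splitFields(String line) {
        return PIPE.split(line);
    }

    /** Counts records per first field, roughly what the mapper and reducer compute together. */
    public static Map<String, Integer> countBrands(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            String brand = splitFields(line)[0];
            counts.merge(brand, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String line = "Samsung|Optima|14|Madhya Pradesh|132401|14200";
        // Unescaped pipe: the regex matches the empty string, so the line is cut per character.
        System.out.println(line.split("|")[0]);
        // Literal pipe: the first field is the whole brand name.
        System.out.println(splitFields(line)[0]);
    }
}
```

If this is indeed the problem, the corresponding change in the mapper would be value.toString().split("\\|"). Separately, the NA check joins its two conditions with ||, so a record is skipped only when both fields contain "NA"; if the intent is to skip a record when either field does, && would be needed instead.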