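As "SSaikia_JtheRocker" explained, mapper tasks are created according to the total number of logical splits of the input's HDFS blocks. I would like to add something on question #3, "How is an input file distributed among the mappers? And how is the output of a mapper distributed among multiple reducers (is this done by the framework, or can you change it)?" For example, consider my word count program, which counts the number of words in a file as shown below: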
#
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WCMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    @Override
    public void map(LongWritable key, Text value, Context context) // context collects the mapper's output
            throws IOException, InterruptedException {
        // value = "How Are You"
        String line = value.toString(); // convert Hadoop's Text "How Are You" to a Java String
        StringTokenizer tokenizer = new StringTokenizer(line); // iterates over the tokens "How", "Are", "You"
        while (tokenizer.hasMoreTokens()) { // returns true while tokens remain
            value.set(tokenizer.nextToken()); // value is overwritten with the current word, e.g. "How"
            context.write(value, new IntWritable(1)); // emit the (word, 1) pair as map output
            // How, 1
            // Are, 1
            // You, 1
        }
        // The framework calls map() once per input line (record), not once per word
    }
}
#
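So in the above program, the line "How Are You" is split into 3 words by the StringTokenizer. The framework calls map() once for that line, and the while loop inside it runs once per word, so here a single map() call emits 3 (word, 1) pairs.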
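To make the call count concrete, here is a minimal plain-Java sketch with no Hadoop dependency. The class `MapCallSketch` and its methods `map` and `runMapTask` are hypothetical stand-ins for the mapper and for the framework loop that feeds it records; they only mimic the behavior described above and are not Hadoop APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

public class MapCallSketch {
    // Hypothetical stand-in for Mapper.map(): handles ONE line and
    // returns one "word,1" pair per token found in that line.
    static List<String> map(String line) {
        List<String> pairs = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            pairs.add(tokenizer.nextToken() + ",1");
        }
        return pairs;
    }

    // Hypothetical stand-in for the framework: calls map() once per input line.
    // The outer list has one entry per map() call.
    static List<List<String>> runMapTask(List<String> lines) {
        List<List<String>> perCall = new ArrayList<>();
        for (String line : lines) {
            perCall.add(map(line));
        }
        return perCall;
    }

    public static void main(String[] args) {
        List<List<String>> result = runMapTask(Arrays.asList("How Are You", "I Am Fine"));
        int pairCount = 0;
        for (List<String> pairs : result) {
            pairCount += pairs.size();
        }
        System.out.println("map() calls: " + result.size());
        System.out.println("pairs emitted: " + pairCount);
    }
}
```

For the two sample lines this reports two map() calls and six emitted pairs: the number of calls follows the number of lines, while the number of pairs follows the number of words.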
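As for the reducers, we can specify how many reducers should produce the output with the "job.setNumReduceTasks(5);" statement. The code snippet below will give you an idea.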
#
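import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class BooksMain {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Use programArgs array to retrieve program arguments.
        String[] programArgs = new GenericOptionsParser(conf, args)
                .getRemainingArgs();
        Job job = Job.getInstance(conf); // replaces the deprecated new Job(conf)
        job.setJarByClass(BooksMain.class);
        job.setMapperClass(BookMapper.class);
        job.setReducerClass(BookReducer.class);
        job.setNumReduceTasks(5); // run 5 reduce tasks, so 5 output files are produced
        // job.setCombinerClass(BookReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // TODO: Update the input path for the location of the inputs of the map-reduce job.
        FileInputFormat.addInputPath(job, new Path(programArgs[0]));
        // TODO: Update the output path for the output directory of the map-reduce job.
        FileOutputFormat.setOutputPath(job, new Path(programArgs[1]));
        // Submit the job and wait for it to finish.
        job.waitForCompletion(true);
        // Submit and return immediately:
        // job.submit();
    }
}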
#
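On the second half of the question, how map output is spread over those 5 reducers: by default the framework decides this with a hash partitioner, and you can change it by registering your own class with `job.setPartitionerClass(...)`. The sketch below reproduces the formula used by Hadoop's default `HashPartitioner` in plain Java. The class `PartitionSketch` and its methods are hypothetical, and because it hashes a Java `String` rather than a Hadoop `Text`, the partition numbers it produces will generally differ from those of a real job; only the mechanism is the same.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PartitionSketch {
    // Same formula as Hadoop's default HashPartitioner:
    // mask off the sign bit, then take the remainder by the reducer count.
    // Assumption: String.hashCode() stands in for the key's own hashCode().
    static int getPartition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Groups keys by the reducer index they would be sent to.
    static Map<Integer, List<String>> assign(List<String> keys, int numReduceTasks) {
        Map<Integer, List<String>> byReducer = new TreeMap<>();
        for (String key : keys) {
            int partition = getPartition(key, numReduceTasks);
            byReducer.computeIfAbsent(partition, p -> new ArrayList<>()).add(key);
        }
        return byReducer;
    }

    public static void main(String[] args) {
        // 5 matches job.setNumReduceTasks(5) in the driver above.
        Map<Integer, List<String>> byReducer =
                assign(Arrays.asList("How", "Are", "You", "How"), 5);
        for (Map.Entry<Integer, List<String>> entry : byReducer.entrySet()) {
            System.out.println("reducer " + entry.getKey() + " <- " + entry.getValue());
        }
    }
}
```

The point to take away is that every occurrence of the same key lands on the same reducer, which is what lets a single reducer see all the counts for a given word.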
@Fakhar, please let me know if anything is still unclear.