2013-11-15 23 views

Answers

2

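You can override the mapper's run method and break out of the while loop once the map loop has iterated 10 times. This assumes your file is not splittable; otherwise you will get the first 10 lines of each split: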

@Override
public void run(Context context) throws IOException, InterruptedException {
    setup(context);

    // Stop after the first 10 records of this mapper's input split.
    int rows = 0;
    while (context.nextKeyValue()) {
        if (rows++ == 10) {
            break;
        }

        map(context.getCurrentKey(), context.getCurrentValue(), context);
    }

    cleanup(context);
}
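To illustrate the caveat about splittable files, here is a minimal sketch in plain Java (no Hadoop dependency). It only simulates the behaviour: the class `SplitLimitDemo`, its helper `lines`, and the two 15-line "splits" are hypothetical names and data made up for this illustration. Each simulated mapper applies the same `rows++ == 10` check to its own split, so two splits yield 20 records rather than 10.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitLimitDemo {
    static final int LIMIT = 10;

    // Mimics the overridden run(): stop after LIMIT records of ONE split.
    static List<String> runMapper(List<String> split) {
        List<String> out = new ArrayList<>();
        int rows = 0;
        for (String record : split) {
            if (rows++ == LIMIT) {
                break;
            }
            out.add(record);
        }
        return out;
    }

    // Hypothetical helper: builds "line<from>" .. "line<to>" as fake input records.
    static List<String> lines(int from, int to) {
        List<String> result = new ArrayList<>();
        for (int i = from; i <= to; i++) {
            result.add("line" + i);
        }
        return result;
    }

    public static void main(String[] args) {
        // One 30-line file that the framework happened to cut into two splits.
        List<String> emitted = new ArrayList<>(runMapper(lines(1, 15)));
        emitted.addAll(runMapper(lines(16, 30)));

        System.out.println(emitted.size());  // prints 20
        System.out.println(emitted.get(10)); // prints line16
    }
}
```

The limit is local to each mapper instance, so a global "first N" needs either a single split or an extra step after the map phase.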
+0

I am now getting the following error at map(context.getCurrentKey(), context.getCurrentValue(), context); Error: java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable –

+0

Check the signature of your map method - does it match LongWritable, Text? –

+0

Hi, I tried this, but the compiler cannot find "Context" and "setup()". I tried importing org.apache.hadoop.mapreduce.Mapper.* ----- I am on Hadoop 1.2.1 – ishan3243

0

Assuming N = 10, we can use the code below to read only 10 records from a file like the following:
line1
line2



line20

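    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    // mapper
    class Mapcls extends Mapper<LongWritable, Text, Text, NullWritable>
    {
        @Override
        public void run(Context con) throws IOException, InterruptedException
        {
            setup(con);

            // Pass at most 10 records of this split to map().
            int rows = 0;
            while (con.nextKeyValue())
            {
                if (rows++ == 10)
                {
                    break;
                }
                map(con.getCurrentKey(), con.getCurrentValue(), con);
            }

            cleanup(con);
        }

        @Override
        public void map(LongWritable key, Text value, Context con) throws IOException, InterruptedException
        {
            // Emit the line itself as the key; no value is needed.
            con.write(value, NullWritable.get());
        }
    }

    // driver
    public class Testjob extends Configured implements Tool
    {
        @Override
        public int run(String[] args) throws Exception
        {
            // Use the configuration prepared by ToolRunner so generic options are honoured.
            Configuration conf = getConf();
            Job job = new Job(conf, "Test-job");
            job.setJobName("tst001");
            job.setJarByClass(getClass());

            job.setMapperClass(Mapcls.class);
            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(NullWritable.class);

            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));

            return job.waitForCompletion(true) ? 0 : 1;
        }

        public static void main(String[] args) throws Exception
        {
            int rc = ToolRunner.run(new Configuration(), new Testjob(), args);
            System.exit(rc);
        }
    }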

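The output will then be (in lexicographic order, because the keys pass through the default sort-and-reduce stage):
line1
line10
line2
line3
line4
line5
line6
line7
line8
line9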
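The ordering of that output (line10 right after line1) is consistent with the keys being sorted as text on their way through the default reduce stage. A minimal plain-Java sketch of the effect follows; `ShuffleOrderDemo` and `sortedKeys` are hypothetical names, and Java's `String` ordering is used here only as a stand-in for Hadoop's byte-wise `Text` comparison, which agrees with it for ASCII keys like these.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ShuffleOrderDemo {
    // Builds the keys "line1" .. "line<n>" and sorts them the way text keys are compared.
    static List<String> sortedKeys(int n) {
        List<String> keys = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            keys.add("line" + i);
        }
        Collections.sort(keys); // lexicographic, not numeric
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(sortedKeys(10));
        // prints [line1, line10, line2, line3, line4, line5, line6, line7, line8, line9]
    }
}
```

If the original file order matters, one option is a map-only job, for example by calling `job.setNumReduceTasks(0)` in the driver, so that mapper output is written without the sort step.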
