2014-07-16

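Global variable or property in MapReduce?

I would like to set some kind of variable or flag during the map phase of an MR job that I can check after the job has completed. I think the best way to show what I want is with some code. P.S. I am using Hadoop 2.2.0.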

public class MRJob {

    public static class MapperTest
            extends Mapper<Object, Text, Text, IntWritable> {

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            // Do some computation to get new value and key
            ...
            // Check if the new value meets some condition, e.g. if (value < 1), set a global variable to true

            context.write(newKey, newValue);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Pass the conf created above instead of a second, separate Configuration
        Job job = Job.getInstance(conf, "word_count");
        // set job configs

        job.waitForCompletion(true);

        // Here I want to be able to check if my global variable has been set to true by any one of the mappers
    }
}

Answer

Use a Counter for this.

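public static enum UpdateCounter {
    UPDATED
}

@Override
public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
    // newKey (Text) and newValue (IntWritable) are computed here, as in the question

    // Record that the condition was seen; the framework sums this counter across all map tasks
    if (newValue.get() < 1) {
        context.getCounter(UpdateCounter.UPDATED).increment(1);
    }

    context.write(newKey, newValue);
}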

After the job has finished, you can check the counter:

Configuration conf = new Configuration();

Job job = Job.getInstance(conf, "word_count");
// set job configs

job.waitForCompletion(true);

// Read the job-wide total of the counter once the job is done
long counter = job.getCounters().findCounter(UpdateCounter.UPDATED).getValue();

if (counter > 0) {
    // some mapper has seen the condition
}
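
To illustrate why a counter works as a job-wide flag: on a cluster the map tasks typically run in JVMs separate from the driver, so a static field set in a mapper is not visible in main, whereas each task's counter increments are reported back and summed by the framework. Below is a rough plain-Java simulation of that summing, with no Hadoop dependency. The class CounterAggregationSketch and the methods runMapTask and aggregate are hypothetical names made up for this sketch, not Hadoop APIs.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation only; none of these names are Hadoop APIs.
class CounterAggregationSketch {

    enum UpdateCounter { UPDATED }

    // Stands in for one map task: it can only touch its own local counters.
    static Map<UpdateCounter, Long> runMapTask(int[] values) {
        Map<UpdateCounter, Long> local = new EnumMap<>(UpdateCounter.class);
        for (int v : values) {
            if (v < 1) {
                // Same role as context.getCounter(...).increment(1) in the answer
                local.merge(UpdateCounter.UPDATED, 1L, Long::sum);
            }
        }
        return local;
    }

    // Stands in for the framework: sums the per-task counters into a job total.
    static long aggregate(List<Map<UpdateCounter, Long>> perTask, UpdateCounter c) {
        long total = 0;
        for (Map<UpdateCounter, Long> taskCounters : perTask) {
            total += taskCounters.getOrDefault(c, 0L);
        }
        return total;
    }

    public static void main(String[] args) {
        List<Map<UpdateCounter, Long>> perTask = new ArrayList<>();
        perTask.add(runMapTask(new int[] {5, 3, 7}));   // no value below 1
        perTask.add(runMapTask(new int[] {4, 0, -2}));  // two values below 1

        long updated = aggregate(perTask, UpdateCounter.UPDATED);
        System.out.println("UPDATED = " + updated);     // prints "UPDATED = 2"
        System.out.println("condition seen: " + (updated > 0));
    }
}
```

The driver only ever sees the summed total, which is why the check in the answer is counter > 0 rather than a comparison against a specific value.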

Thanks for your answer. Just to clarify, do I create the UpdateCounter enum inside the mapper class?

@Mo. It doesn't matter where you put it, as long as the class is accessible at runtime.

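Sorry, one last question. I have a slightly similar use case where I need to set a custom global value. For example, something like this in the mapper: map(key, ....) { if (key == "foo") { globalVariable = value; } } Then, once the job has finished, I need to access that variable, similar to the counter. Thanks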