Can the combiner and the reducer be different?

22

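Yes, a combiner can be different from the Reducer, although your Combiner will still be implementing the Reducer interface. Combiners can only be used in specific cases, which depend on the job. The Combiner operates like a Reducer, but only on the subset of the key/value output from each Mapper.

One constraint that your Combiner has, unlike a Reducer, is that its input and output key and value types must match the output types of your Mapper.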
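To make the type constraint concrete, here is a minimal sketch in plain Java. It does not use Hadoop: `SimpleReducer`, `runForKey` and `demo` are hypothetical names invented for this illustration. The point is only that the combiner slot has the shape `<K, V, K, V>`, while the final reducer is free to emit other types.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone demo; none of these types are Hadoop classes.
class CombinerTypeDemo {

    // Simplified stand-in for a reduce function (assumption, not Hadoop's Reducer).
    interface SimpleReducer<KIN, VIN, KOUT, VOUT> {
        Map.Entry<KOUT, VOUT> reduce(KIN key, List<VIN> values);
    }

    // The combiner is typed <K, V, K, V>: it consumes and produces the mapper's
    // output types. The reducer may emit different types (KOUT, VOUT).
    static <K, V, KOUT, VOUT> Map.Entry<KOUT, VOUT> runForKey(
            K key,
            List<List<V>> perMapperValues,
            SimpleReducer<K, V, K, V> combiner,
            SimpleReducer<K, V, KOUT, VOUT> reducer) {
        List<V> shuffled = new ArrayList<>();
        for (List<V> mapperValues : perMapperValues) {
            // Pre-aggregate each mapper's values locally before the "shuffle".
            shuffled.add(combiner.reduce(key, mapperValues).getValue());
        }
        return reducer.reduce(key, shuffled);
    }

    static long sum(List<Long> values) {
        long total = 0;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    static String demo() {
        // Combiner: Long in, Long out (same as the mapper output type).
        SimpleReducer<String, Long, String, Long> combiner =
                (k, vs) -> new AbstractMap.SimpleEntry<String, Long>(k, sum(vs));
        // Reducer: Long in, String out. It could not be plugged in as the combiner.
        SimpleReducer<String, Long, String, String> reducer =
                (k, vs) -> new AbstractMap.SimpleEntry<String, String>(k, "total=" + sum(vs));
        List<List<Long>> perMapper = Arrays.asList(
                Arrays.asList(1L, 2L), Arrays.asList(3L));
        Map.Entry<String, String> result = runForKey("a", perMapper, combiner, reducer);
        return result.getKey() + " " + result.getValue();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "a total=6"
    }
}
```

Passing `reducer` where `combiner` is expected does not compile in this sketch, which mirrors why a Reducer whose output types differ from the Mapper's output types cannot double as the Combiner.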

2012-07-31 03:06:13

+1

Can a combiner be described as a local reducer? – user1170330 2015-05-28 01:09:50

+1

There is some really useful material at this link. It explains when a combiner should be used: http://www.philippeadjiman.com/blog/2010/01/14/hadoop-tutorial-series-issue-4-to-use-or-not-to-use-a-combiner/ – mk2 2016-04-26 21:26:43

7

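Yes, they can certainly be different, but I don't think you would want to use a different class, since most of the time you will end up with unexpected results.

Combiners can only be used with functions that are commutative (a.b = b.a) and associative {a.(b.c) = (a.b).c}. This also means that a combiner may run on only a subset of your keys and values, or may not run at all, and you still want the output of the program to remain the same.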
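As a small illustration of that requirement, the following plain-Java sketch (no Hadoop involved; `CombinerSumDemo` and its sample values are made up for this example) applies a sum to the same data with no combining, with one combine per mapper, and with a combine over only part of one mapper's output. Because addition is commutative and associative, all three give the same total.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone demo; no Hadoop classes are used.
class CombinerSumDemo {

    // The "reduce" function: sums the values seen for one key.
    static long sum(List<Long> values) {
        long total = 0;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        // Values emitted for a single key by two mappers (sample data).
        List<Long> mapper1 = Arrays.asList(3L, 5L, 7L);
        List<Long> mapper2 = Arrays.asList(2L, 4L);

        // Combiner never runs: the reducer sees every raw value.
        long noCombiner = sum(Arrays.asList(3L, 5L, 7L, 2L, 4L));

        // Combiner runs once per mapper: the reducer sees two partial sums.
        long combinedOnce = sum(Arrays.asList(sum(mapper1), sum(mapper2)));

        // Combiner runs on only part of mapper1's output (for example one spill).
        long partiallyCombined = sum(Arrays.asList(
                sum(Arrays.asList(3L, 5L)), 7L, sum(mapper2)));

        // prints "21 21 21"
        System.out.println(noCombiner + " " + combinedOnce + " " + partiallyCombined);
    }
}
```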

Choosing a different class with different logic might not give you a sensible output.


2012-07-31 02:11:13

+0

This should be the accepted answer – mk2 2016-04-26 21:29:39

0

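The primary goal of a combiner is to minimize the number of key/value pairs that are shuffled across the network between mappers and reducers, and so to save as much bandwidth as possible.

The rule of thumb for a combiner is that it must have the same input and output types. The reason is that the use of a combiner is not guaranteed: it may or may not be run, depending on the volume and number of spills.

A reducer can be used as a combiner when it satisfies this rule, i.e. it has the same input and output types.

The other important rule for a combiner is that it can only be used when the function you want to apply is both commutative and associative, like adding numbers. It does not work for something like an average (if you use the same code as the reducer).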
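To see why an average cannot simply be reused as its own combiner, here is a minimal plain-Java sketch (no Hadoop; `CombinerAverageDemo`, `Partial` and the sample values are hypothetical). Averaging per-mapper averages gives the wrong answer when the mappers hold different numbers of values, whereas merging (sum, count) pairs and dividing only at the end stays exact.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone demo; no Hadoop classes are used.
class CombinerAverageDemo {

    // Partial result that is safe to merge: a running sum plus a count.
    static final class Partial {
        final long sum;
        final long count;

        Partial(long sum, long count) {
            this.sum = sum;
            this.count = count;
        }
    }

    // Naive "reducer" logic: a plain average of whatever it is given.
    static double average(List<Double> values) {
        double total = 0;
        for (double v : values) {
            total += v;
        }
        return total / values.size();
    }

    // Combiner-friendly logic: adding sums and counts is associative.
    static Partial merge(List<Partial> partials) {
        long sum = 0;
        long count = 0;
        for (Partial p : partials) {
            sum += p.sum;
            count += p.count;
        }
        return new Partial(sum, count);
    }

    public static void main(String[] args) {
        // Mapper 1 emitted 1, 2, 3 and mapper 2 emitted 10 for the same key.
        double naive = average(Arrays.asList(
                average(Arrays.asList(1.0, 2.0, 3.0)),
                average(Arrays.asList(10.0))));

        Partial m1 = merge(Arrays.asList(
                new Partial(1, 1), new Partial(2, 1), new Partial(3, 1)));
        Partial m2 = merge(Arrays.asList(new Partial(10, 1)));
        Partial all = merge(Arrays.asList(m1, m2));
        double exact = (double) all.sum / all.count;

        // prints "6.0 vs 4.0": the average of averages is wrong, sum/count is right
        System.out.println(naive + " vs " + exact);
    }
}
```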

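Now, to answer your question: yes, of course they can be different. When your reducer has different input and output types, you have no choice but to make a separate copy of the reducer code and modify it.

If you are concerned about the reducer's logic, you can also implement the combiner in a different way. For example, the combiner can keep a collection object as a local buffer of all the values it receives. This is less risky than doing the same in the reducer, because the reducer is more likely than the combiner to run out of memory. Other differences in logic can certainly exist, and they do.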


2016-04-08 17:54:40 user3123372

2

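Here is an implementation. You can run it with or without the combiner, and both should give the same answer, apart from rounding caused by the integer division in the combiner. Here the Reducer and the Combiner have different purposes and different implementations.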

package combiner;

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class Map extends Mapper<LongWritable, Text, Text, Average> {

    // Reused across calls to avoid allocating a new Text per record.
    private final Text name = new Text();

    @Override
    protected void map(LongWritable offSet, Text line, Context context) throws IOException, InterruptedException {
        // Each input line is expected to be "<name> <number>".
        String[] row = line.toString().split(" ");
        System.out.println("Key " + row[0] + " Value " + row[1]);
        name.set(row[0]);
        // Emit the number as an average over a single value (count = 1).
        context.write(name, new Average(Integer.parseInt(row[1]), 1));
    }
}

Reduce class

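package combiner;

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class Reduce extends Reducer<Text, Average, Text, LongWritable> {
    private final LongWritable avg = new LongWritable();

    @Override
    protected void reduce(Text key, Iterable<Average> val, Context context) throws IOException, InterruptedException {
        long total = 0; // long, so large sums do not overflow an int
        long count = 0;

        // Each Average holds a partial average and the number of values behind it,
        // so weight it by its count to recover the partial sum.
        for (Average value : val) {
            total += value.number * value.count;
            count += value.count;
        }
        avg.set(total / count); // integer division, computed once after the loop
        context.write(key, avg);
    }
}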

MapObject class

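package combiner;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableUtils;

// Map output value: an average ("number") together with how many values it covers.
public class Average implements Writable {

    long number;
    int count;

    // Hadoop needs a no-argument constructor to deserialize the object.
    public Average() {}

    public Average(long number, int count) {
        this.number = number;
        this.count = count;
    }

    public long getNumber() {return number;}
    public void setNumber(long number) {this.number = number;}
    public int getCount() {return count;}
    public void setCount(int count) {this.count = count;}

    // Fields must be read in the same order in which write() emits them.
    @Override
    public void readFields(DataInput dataInput) throws IOException {
        number = WritableUtils.readVLong(dataInput);
        count = WritableUtils.readVInt(dataInput);
    }

    @Override
    public void write(DataOutput dataOutput) throws IOException {
        WritableUtils.writeVLong(dataOutput, number);
        WritableUtils.writeVInt(dataOutput, count);
    }
}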

Combine class

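package combiner;

import java.io.IOException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class Combine extends Reducer<Text, Average, Text, Average> {

    @Override
    protected void reduce(Text name, Iterable<Average> val, Context context) throws IOException, InterruptedException {
        long total = 0;
        int count = 0;

        // Weight each partial average by its count: the combiner may be applied
        // more than once, so its input can already be combined output.
        for (Average value : val) {
            total += value.number * value.count;
            count += value.count;
        }
        // Integer division truncates, so results can differ slightly from a run
        // without the combiner; carrying the running sum instead would avoid that.
        context.write(name, new Average(total / count, count));
    }
}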

Driver class

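package combiner;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Driver1 {

    public static void main(String[] args) throws Exception {

        Configuration conf = new Configuration();
        if (args.length != 2) {
            System.err.println("Usage: Driver1 <in> <out>");
            System.exit(2);
        }
        // Job.getInstance replaces the deprecated Job constructor.
        Job job = Job.getInstance(conf, "CustomCombiner");
        job.setJarByClass(Driver1.class);
        job.setMapperClass(Map.class);
        job.setCombinerClass(Combine.class); // comment this line out to run without the combiner
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Average.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class); // must match the Reducer's output value type
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}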

Get it from Git here.

Check out the code and leave your suggestions.


2016-04-09 20:44:28 user3123372
