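Given a class named KeyLabelDistance that I pass as both the key and the value in Hadoop, I want to perform a secondary sort on it: first sort the keys by increasing key value, then by DECREASING distance. How do I implement a grouping comparator in Hadoop?
To do this, I need to write my own GroupingComparator. My question: since the setGroupingComparator() method only accepts a class that extends RawComparator, how do I compare the bytes in the grouping comparator? Do I need to serialize and deserialize the objects explicitly? Also, since the KeyLabelDistance class below implements WritableComparable, is a separate SortComparator redundant?
I found this answer on using SortComparator and GroupComparator: What are the differences between Sort Comparator and Group Comparator in Hadoop?
Here is the KeyLabelDistance implementation: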
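import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.WritableComparable;

public class KeyLabelDistance implements WritableComparable<KeyLabelDistance>
{
    private int key;
    private int label;
    private double distance;

    KeyLabelDistance()
    {
        key = 0;
        label = 0;
        distance = 0;
    }

    KeyLabelDistance(int key, int label, double distance)
    {
        this.key = key;
        this.label = label;
        this.distance = distance;
    }

    public int getKey() {
        return key;
    }
    public void setKey(int key) {
        this.key = key;
    }
    public int getLabel() {
        return label;
    }
    public void setLabel(int label) {
        this.label = label;
    }
    public double getDistance() {
        return distance;
    }
    public void setDistance(double distance) {
        this.distance = distance;
    }

    // Writable requires both methods; readFields must read the fields
    // in the same order in which write emits them.
    @Override
    public void write(DataOutput out) throws IOException
    {
        out.writeInt(key);
        out.writeInt(label);
        out.writeDouble(distance);
    }

    @Override
    public void readFields(DataInput in) throws IOException
    {
        key = in.readInt();
        label = in.readInt();
        distance = in.readDouble();
    }

    // Comparable declares compareTo with a single argument: compare this object with rhs.
    @Override
    public int compareTo(KeyLabelDistance rhs)
    {
        if(this == rhs)
            return 0;
        else
        {
            if(key < rhs.getKey())
                return -1;
            else if(key > rhs.getKey())
                return 1;
            else
            {
                //If the keys are equal, look at the distances -> the larger the "distance", the greater the "similarity", so the comparison is deliberately reversed
                if(distance < rhs.getDistance())
                    return 1;
                else if(distance > rhs.getDistance())
                    return -1;
                else return 0;
            }
        }
    }
}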
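To make the intended order concrete, here is a small sketch in plain Java, without Hadoop. `SortOrderDemo` and `Entry` are hypothetical stand-ins invented for illustration and are not part of the job; the comparator orders by increasing key and, for equal keys, by decreasing distance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SortOrderDemo {
    /** Hypothetical, simplified stand-in for KeyLabelDistance (no Hadoop serialization). */
    static final class Entry {
        final int key;
        final int label;
        final double distance;

        Entry(int key, int label, double distance) {
            this.key = key;
            this.label = label;
            this.distance = distance;
        }

        @Override
        public String toString() {
            return key + ":" + label + ":" + distance;
        }
    }

    /** Increasing key first; for equal keys, decreasing distance. */
    static final Comparator<Entry> SORT_ORDER =
        Comparator.comparingInt((Entry e) -> e.key)
            .thenComparing(Comparator.comparingDouble((Entry e) -> e.distance).reversed());

    /** Returns a sorted copy and leaves the input list untouched. */
    static List<Entry> sorted(List<Entry> input) {
        List<Entry> copy = new ArrayList<>(input);
        copy.sort(SORT_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
            new Entry(2, 1, 0.5),
            new Entry(1, 2, 0.3),
            new Entry(1, 1, 0.8));
        // prints [1:1:0.8, 1:2:0.3, 2:1:0.5]
        System.out.println(sorted(entries));
    }
}
```

Within key 1 the entry with distance 0.8 comes before the one with 0.3, which is the order the reducer should see the values in.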
The code for the grouping comparator is as follows:
public class KeyLabelDistanceGroupingComparator extends WritableComparator{
    public int compare (KeyLabelDistance lhs, KeyLabelDistance rhs)
    {
        if(lhs == rhs)
            return 0;
        else
        {
            if(lhs.getKey() < rhs.getKey())
                return -1;
            else if(lhs.getKey() > rhs.getKey())
                return 1;
            return 0;
        }
    }
}
Any help is appreciated. Thanks in advance.
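On the question of comparing bytes: RawComparator's compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) receives the serialized keys, so one option is to read only the leading field from the raw bytes instead of deserializing the whole object. The sketch below uses only the JDK and assumes the key is serialized as key (int), label (int), distance (double), in that order, through DataOutput. The class and method names are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

class RawKeyCompareDemo {
    /**
     * Serializes the three fields in the ASSUMED order key, label, distance,
     * using the same big-endian DataOutput encoding a Writable would use.
     */
    static byte[] serialize(int key, int label, double distance) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(key);
            out.writeInt(label);
            out.writeDouble(distance);
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    /**
     * Same parameter shape as RawComparator.compare: each key occupies
     * l bytes of b starting at offset s. Only the leading 4-byte int
     * (the grouping key) is read; label and distance are ignored.
     */
    static int compareKeys(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int k1 = ByteBuffer.wrap(b1, s1, l1).getInt(); // big-endian by default
        int k2 = ByteBuffer.wrap(b2, s2, l2).getInt();
        return Integer.compare(k1, k2);
    }

    public static void main(String[] args) {
        byte[] a = serialize(1, 7, 0.9);
        byte[] b = serialize(1, 3, 0.2);
        byte[] c = serialize(2, 7, 0.9);
        System.out.println(compareKeys(a, 0, a.length, b, 0, b.length)); // prints 0
        System.out.println(compareKeys(a, 0, a.length, c, 0, c.length)); // prints -1
    }
}
```

Inside Hadoop, the static helper WritableComparator.readInt(byte[], int) performs the same 4-byte read. If you would rather not touch bytes at all, extending WritableComparator with createInstances set to true lets the framework deserialize the keys before calling your object-level compare.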
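Thanks, I tried that. I have now included the grouping comparator code in my question as well, but I get the following error: KeyLabelDistanceGroupingComparator.java:3: cannot find symbol symbol: constructor WritableComparator() location: class org.apache.hadoop.io.WritableComparator – user3377770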
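That is the error you get in Java when you extend a class and the superclass has no default (no-argument) constructor. Create a constructor in your code and call super(). For example: XYZKeyValueComparator(){ super(MyWritable.class, true); } – Venkat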
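The constructor rule from the comment above can be reproduced without Hadoop. In the sketch below, `BaseComparator` is a hypothetical stand-in for a superclass such as WritableComparator whose only constructor takes arguments; `DemoKey` and `DemoGroupingComparator` are likewise invented names, so treat this as an illustration of the pattern and not as the Hadoop API itself.

```java
import java.util.Comparator;

/** Hypothetical stand-in for a framework base class with no no-argument constructor. */
abstract class BaseComparator implements Comparator<Object> {
    private final Class<?> keyClass;

    // The only constructor: subclasses must call it explicitly via super(...).
    // createInstances is unused here; it only mirrors the shape of the comment's example.
    protected BaseComparator(Class<?> keyClass, boolean createInstances) {
        this.keyClass = keyClass;
    }

    Class<?> getKeyClass() {
        return keyClass;
    }
}

/** Hypothetical simplified key with the two fields relevant to sorting and grouping. */
class DemoKey {
    final int key;
    final double distance;

    DemoKey(int key, double distance) {
        this.key = key;
        this.distance = distance;
    }
}

class DemoGroupingComparator extends BaseComparator {
    DemoGroupingComparator() {
        // Without this call the compiler inserts an implicit super(),
        // which does not exist on BaseComparator, and compilation fails.
        super(DemoKey.class, true);
    }

    // Overrides the base-typed method. A compare(DemoKey, DemoKey) method
    // would only be an overload and would never be called by the framework.
    @Override
    public int compare(Object a, Object b) {
        DemoKey lhs = (DemoKey) a;
        DemoKey rhs = (DemoKey) b;
        // Grouping looks at the key only, so all distances fall into one group.
        return Integer.compare(lhs.key, rhs.key);
    }
}

class GroupingComparatorDemo {
    public static void main(String[] args) {
        DemoGroupingComparator cmp = new DemoGroupingComparator();
        System.out.println(cmp.compare(new DemoKey(1, 0.9), new DemoKey(1, 0.2))); // prints 0
        System.out.println(cmp.compare(new DemoKey(1, 0.9), new DemoKey(2, 0.9))); // prints -1
    }
}
```

The same two points should carry over to the real class: call super(KeyLabelDistance.class, true) from the constructor, and override compare(WritableComparable, WritableComparable) with a cast to KeyLabelDistance, since that is the method WritableComparator invokes after deserializing the keys.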