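I need to improve my MR job, and one thing I thought of is implementing a custom RawComparator. However, my key class has many String fields in addition to a few int fields, and I don't know how to parse the String fields out of the byte[] in order to implement the custom RawComparator.
My key class: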
public class GeneralKey {
    private int day;
    private int hour;
    private String type;
    private String name;
    ..
}
My custom RawComparator:
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparator;

public class GeneralKeyComparator extends WritableComparator {
    private static final Text.Comparator TEXT_COMPARATOR = new Text.Comparator();

    protected GeneralKeyComparator() {
        super(GeneralKey.class);
    }

    @Override
    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        // day is the first int in the serialized key
        int day1 = readInt(b1, s1);
        int day2 = readInt(b2, s2);
        int comp = (day1 < day2) ? -1 : (day1 == day2) ? 0 : 1;
        if (0 != comp) {
            return comp;
        }
        // hour is the second int, 4 bytes after day
        int hr1 = readInt(b1, s1 + 4);
        int hr2 = readInt(b2, s2 + 4);
        comp = (hr1 < hr2) ? -1 : (hr1 == hr2) ? 0 : 1;
        // .... how to compare the String fields here???
        return comp;
    }
}
Googling around, I found people trying this:
try {
int firstL1 = WritableUtils.decodeVIntSize(b1[s1]) + readInt(b1, s1+8);
int firstL2 = WritableUtils.decodeVIntSize(b2[s2]) + readVInt(b2, s2+8);
comp = TEXT_COMPARATOR.compare(b1, s1, firstL1, b2, s2, firstL2);
} catch (IOException e) {
throw new IllegalArgumentException(e);
}
But I don't understand how this works, and I don't think it applies to my case. Can anyone help? Thanks.
Adding the readFields() and write() methods here:
@Override
public void readFields(DataInput input) throws IOException {
    // Fields are read in exactly the order write() emits them
    day = input.readInt();
    hour = input.readInt();
    type = input.readUTF();
    name = input.readUTF();
    ...
}

@Override
public void write(DataOutput output) throws IOException {
    output.writeInt(day);
    output.writeInt(hour);
    output.writeUTF(type);
    output.writeUTF(name);
    ...
}
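For reference, here is a minimal sketch of the byte layout that this write() produces, assuming only the four fields shown (the elided fields are ignored). `KeyLayoutDemo` and `serialize` are hypothetical names used for illustration, not part of Hadoop. `writeInt` emits 4 big-endian bytes, and `writeUTF` emits a 2-byte unsigned length followed by the string in modified UTF-8, which is not the vint-prefixed encoding that `Text` uses.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper: mirrors write() for the four visible fields only.
public class KeyLayoutDemo {
    static byte[] serialize(int day, int hour, String type, String name) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(day);   // offsets 0..3
            out.writeInt(hour);  // offsets 4..7
            out.writeUTF(type);  // offset 8: 2-byte length, then the encoded bytes
            out.writeUTF(name);  // starts right after the bytes of type
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            // Not expected for an in-memory stream
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringBuilder hex = new StringBuilder();
        for (byte x : serialize(1, 2, "ab", "c")) {
            hex.append(String.format("%02x ", x));
        }
        // prints "00 00 00 01 00 00 00 02 00 02 61 62 00 01 63"
        System.out.println(hex.toString().trim());
    }
}
```

So the first String starts at offset 8, and where the second one starts depends on the length of the first.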
Can you paste the 'readFields' and 'write' methods of GeneralKey? How to compare GeneralKey depends on how GeneralKey is written to binary. – zsxwing
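To illustrate that point, here is a rough sketch of a raw comparison for keys written as two `writeInt` calls followed by two `writeUTF` calls, assuming no other fields come before them. `RawKeyCompareSketch` and `serialize` are hypothetical names. The three small helpers are local stand-ins for the static `readInt`, `readUnsignedShort` and `compareBytes` methods of `WritableComparator`, written out only so that the example runs without Hadoop on the classpath.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Sketch only, under the layout assumption stated above.
public class RawKeyCompareSketch {
    static int readInt(byte[] b, int s) {
        return ((b[s] & 0xff) << 24) | ((b[s + 1] & 0xff) << 16)
             | ((b[s + 2] & 0xff) << 8) | (b[s + 3] & 0xff);
    }

    static int readUnsignedShort(byte[] b, int s) {
        return ((b[s] & 0xff) << 8) | (b[s + 1] & 0xff);
    }

    // Lexicographic comparison of unsigned bytes; on a common prefix the shorter range sorts first.
    static int compareBytes(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int x = b1[s1 + i] & 0xff;
            int y = b2[s2 + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return l1 - l2;
    }

    static int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int comp = Integer.compare(readInt(b1, s1), readInt(b2, s2));      // day
        if (comp != 0) return comp;
        comp = Integer.compare(readInt(b1, s1 + 4), readInt(b2, s2 + 4));  // hour
        if (comp != 0) return comp;

        // type: writeUTF stores a 2-byte length, then that many bytes
        int p1 = s1 + 8;
        int p2 = s2 + 8;
        int len1 = readUnsignedShort(b1, p1);
        int len2 = readUnsignedShort(b2, p2);
        comp = compareBytes(b1, p1 + 2, len1, b2, p2 + 2, len2);
        if (comp != 0) return comp;

        // name: skip past type (2 length bytes plus its payload)
        p1 += 2 + len1;
        p2 += 2 + len2;
        len1 = readUnsignedShort(b1, p1);
        len2 = readUnsignedShort(b2, p2);
        return compareBytes(b1, p1 + 2, len1, b2, p2 + 2, len2);
    }

    // Test helper: serializes the four fields the same way write() does.
    static byte[] serialize(int day, int hour, String type, String name) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(day);
            out.writeInt(hour);
            out.writeUTF(type);
            out.writeUTF(name);
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] a = serialize(1, 2, "ab", "x");
        byte[] b = serialize(1, 2, "ac", "a");
        System.out.println(compare(a, 0, a.length, b, 0, b.length) < 0); // prints "true"
    }
}
```

In a real job the body of `compare` would go into the `compare(byte[], int, int, byte[], int, int)` override of the `WritableComparator` subclass and call the inherited static helpers instead. The byte-wise order of the encoded strings is consistent, which is what sorting needs, but it is not guaranteed to match `String.compareTo` for every input, so the key's own `compareTo` should be kept in agreement with it.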