Developing a Hive UDAF, hit a ClassCastException with no idea why

```java
// Imports added for completeness (Hive 0.x-era package locations).
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.parse.SemanticException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFParameterInfo;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFResolver2;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils;
import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.util.StringUtils;

public class GenericUdafMemberLevel implements GenericUDAFResolver2 {
private static final Log LOG = LogFactory.getLog(GenericUdafMemberLevel.class.getName());
@Override
public GenericUDAFEvaluator getEvaluator(GenericUDAFParameterInfo paramInfo)
throws SemanticException {
return new GenericUdafMemberLevelEvaluator();
}
@Override
// Argument validation
public GenericUDAFEvaluator getEvaluator(TypeInfo[] parameters)
throws SemanticException {
if (parameters.length != 2) { // check the number of arguments
throw new UDFArgumentTypeException(parameters.length - 1,
"Exactly two arguments are expected.");
}
// Arguments must be primitive types, i.e. not complex types
if (parameters[0].getCategory() != ObjectInspector.Category.PRIMITIVE) {
throw new UDFArgumentTypeException(0,
"Only primitive type arguments are accepted but "
+ parameters[0].getTypeName() + " is passed.");
}
if (parameters[1].getCategory() != ObjectInspector.Category.PRIMITIVE) {
throw new UDFArgumentTypeException(1,
"Only primitive type arguments are accepted but "
+ parameters[1].getTypeName() + " is passed.");
}
return new GenericUdafMemberLevelEvaluator();
}
public static class GenericUdafMemberLevelEvaluator extends GenericUDAFEvaluator {
private PrimitiveObjectInspector inputOI;
private PrimitiveObjectInspector inputOI2;
private DoubleWritable result;
@Override
public ObjectInspector init(Mode m, ObjectInspector[] parameters)
throws HiveException {
super.init(m, parameters);
if (m == Mode.PARTIAL1 || m == Mode.COMPLETE){
inputOI = (PrimitiveObjectInspector) parameters[0];
inputOI2 = (PrimitiveObjectInspector) parameters[1];
result = new DoubleWritable(0);
}
return PrimitiveObjectInspectorFactory.writableLongObjectInspector;
}
/** class for storing the running sum. */
static class SumAgg implements AggregationBuffer {
boolean empty;
double value;
}
@Override
// Allocate the buffer that stores the running sum across the mapper,
// combiner and reducer stages.
// Clear the buffer's memory before use via reset().
public AggregationBuffer getNewAggregationBuffer() throws HiveException {
SumAgg buffer = new SumAgg();
reset(buffer);
return buffer;
}
@Override
// Reset the sum to 0.
// MapReduce reuses mapper and reducer objects, so for compatibility the
// buffer memory must be reusable as well.
public void reset(AggregationBuffer agg) throws HiveException {
((SumAgg) agg).value = 0.0;
((SumAgg) agg).empty = true;
}
private boolean warned = false;
// Iteration:
// called on the map side for each row; just add the input value to the
// running sum kept in agg.
@Override
public void iterate(AggregationBuffer agg, Object[] parameters)
throws HiveException {
// parameters == null means the input table/split is empty
if (parameters == null) {
return;
}
try {
double flag = PrimitiveObjectInspectorUtils.getDouble(parameters[1], inputOI2);
if (flag > 1.0) // the condition on the second argument (col2)
merge(agg, parameters[0]); // reuse merge() so the map-side work can be combined later
} catch (NumberFormatException e) {
if (!warned) {
warned = true;
LOG.warn(getClass().getSimpleName() + " "
+ StringUtils.stringifyException(e));
}
}
}
@Override
// The combiner merges map outputs, and the reducer merges mapper or combiner outputs.
public void merge(AggregationBuffer agg, Object partial)
throws HiveException {
if (partial != null) {
// Extract each field's value through its ObjectInspector.
double p = PrimitiveObjectInspectorUtils.getDouble(partial, inputOI);
((SumAgg) agg).value += p;
}
}
@Override
// Return the result from the reducer, or, when there is only a mapper and no reducer, return the result on the mapper side.
public Object terminatePartial(AggregationBuffer agg)
throws HiveException {
return terminate(agg);
}
@Override
public Object terminate(AggregationBuffer agg) throws HiveException {
result.set(((SumAgg) agg).value);
return result;
}
}
}
```
I have added some comments (originally in Chinese) to the code to help explain the logic. The idea of the UDAF is as follows: `select test_sum(col1, col2) from tbl;` adds up the values of col1 for the rows where col2 satisfies some condition. Most of the code is copied from the built-in avg() UDAF.
I ran into a weird exception:

```
java.lang.RuntimeException: Hive Runtime Error while closing operators
    at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:226)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.io.DoubleWritable cannot be cast to org.apache.hadoop.io.LongWritable
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:1132)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:558)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:567)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:567)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:567)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:193)
    ... 8 more
Caused by: java.lang.ClassCastException: org.apache.hadoop.io.DoubleWritable cannot be cast to org.apache.hadoop.io.LongWritable
    at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableLongObjectInspector.get(WritableLongObjectInspector.java:35)
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:323)
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serializeStruct(LazyBinarySerDe.java:255)
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:202)
    at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:236)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1061)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:1113)
    ... 13 more
```
What is wrong with my UDAF? Please kindly point it out. Thank you.
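The stack trace points at the mismatch: `init()` declares the output type as `writableLongObjectInspector`, while `terminate()` and `terminatePartial()` actually return a `DoubleWritable`, so the serializer tries to cast a `DoubleWritable` to a `LongWritable` and fails. A minimal sketch of the likely fix, assuming the result is meant to be a double; note that `writableDoubleObjectInspector` pairs with Hive's own `org.apache.hadoop.hive.serde2.io.DoubleWritable`, not Hadoop's `org.apache.hadoop.io.DoubleWritable`, so the import has to change as well:

```java
// Sketch of a corrected init(), assuming a double-typed result.
// The DoubleWritable field and import must be
// org.apache.hadoop.hive.serde2.io.DoubleWritable (Hive's class),
// not org.apache.hadoop.io.DoubleWritable (Hadoop's class).
@Override
public ObjectInspector init(Mode m, ObjectInspector[] parameters)
        throws HiveException {
    super.init(m, parameters);
    if (m == Mode.PARTIAL1 || m == Mode.COMPLETE) {
        inputOI = (PrimitiveObjectInspector) parameters[0];
        inputOI2 = (PrimitiveObjectInspector) parameters[1];
    }
    result = new DoubleWritable(0);
    // Declare the type that terminate()/terminatePartial() actually return:
    // a DoubleWritable, hence a double object inspector, not a long one.
    return PrimitiveObjectInspectorFactory.writableDoubleObjectInspector;
}
```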
I have fixed the typo. But when executing the UDAF I got a NullPointerException. Could you give an example of the init() part of the UDAF? – user2114243 2013-03-08 01:31:24
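The NullPointerException is consistent with `init()` only initializing `inputOI`/`inputOI2` in the `PARTIAL1`/`COMPLETE` branch: in `PARTIAL2`/`FINAL` mode, `merge()` reads through a null `inputOI`, and in the original code `result` was also only allocated in that branch, so `terminate()` would dereference null on the reduce side. A sketch of an `init()` covering all four modes, building on the corrected version above and still assuming both the partial and final results are doubles:

```java
@Override
public ObjectInspector init(Mode m, ObjectInspector[] parameters)
        throws HiveException {
    super.init(m, parameters);
    if (m == Mode.PARTIAL1 || m == Mode.COMPLETE) {
        // Map side (or single-stage execution): the parameters are the
        // raw table columns (col1, col2).
        inputOI = (PrimitiveObjectInspector) parameters[0];
        inputOI2 = (PrimitiveObjectInspector) parameters[1];
    } else {
        // PARTIAL2 / FINAL: the single parameter is the partial sum emitted
        // by terminatePartial(); capture its inspector so that merge() does
        // not hit a null inputOI.
        inputOI = (PrimitiveObjectInspector) parameters[0];
    }
    // Allocate the result in every mode: terminate() runs in FINAL mode,
    // where the PARTIAL1/COMPLETE-only branch never executed.
    result = new DoubleWritable(0);
    return PrimitiveObjectInspectorFactory.writableDoubleObjectInspector;
}
```

With this, `merge()` can keep using `inputOI` to read the partial double at every stage.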