I am trying to write a map-reduce job that computes the distribution of field values in a Hive table (Hadoop 2.2.0.2.0.6.0-101), reading the table through HCatalog. For example, given the input Hive table "ATable":
+------+--------+
| name | rating |
+------+--------+
| Bond | 7      |
| Megre| 2      |
| Holms| 11     |
| Puaro| 7      |
| Holms| 1      |
| Puaro| 7      |
| Megre| 2      |
| Puaro| 7      |
+------+--------+
The map-reduce job should produce the following output table in Hive:
+--------+-------+-------+
| Field  | Value | Count |
+--------+-------+-------+
| name   | Bond  | 1     |
| name   | Puaro | 3     |
| name   | Megre | 2     |
| name   | Holms | 2     |
| rating | 7     | 4     |
| rating | 11    | 1     |
| rating | 1     | 1     |
| rating | 2     | 2     |
+--------+-------+-------+
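To make the intended result concrete, here is a minimal plain-Java sketch of the counting logic the job should implement, independent of Hadoop. The class name, the hard-coded column labels, and the `"field=value"` key encoding are my own illustrative choices, not part of the actual job:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FieldValueCounts {
    // Count occurrences of every (field, value) pair across the rows,
    // mirroring the (Field, Value, Count) output table above.
    // Rows are {name, rating}; column labels are hard-coded for the sketch.
    static Map<String, Integer> countDistribution(String[][] rows) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        String[] fields = {"name", "rating"};
        for (String[] row : rows) {
            for (int i = 0; i < fields.length; i++) {
                // Key combines field name and value, e.g. "name=Puaro".
                counts.merge(fields[i] + "=" + row[i], 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String[][] aTable = {
            {"Bond", "7"}, {"Megre", "2"}, {"Holms", "11"}, {"Puaro", "7"},
            {"Holms", "1"}, {"Puaro", "7"}, {"Megre", "2"}, {"Puaro", "7"}
        };
        countDistribution(aTable)
            .forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

In the real job the mapper would emit each (field, value) pair as a key with a count of 1, and the reducer would sum the counts, exactly like a word count.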
To get the field names and values I need access to the HCatalog metadata, so that I can use them in the map method (org.apache.hadoop.mapreduce.Mapper). For this I tried to adapt the example at http://java.dzone.com/articles/mapreduce-hive-tables-using

The code from that example compiles, but produces a lot of deprecation warnings:
protected void map(WritableComparable key, HCatRecord value,
                   org.apache.hadoop.mapreduce.Mapper.Context context)
        throws IOException, InterruptedException {
    // Get table schema
    HCatSchema schema = HCatBaseInputFormat.getTableSchema(context);
    Integer year = new Integer(value.getString("year", schema));
    Integer month = new Integer(value.getString("month", schema));
    Integer dayOfMonth = value.getInteger("dayofmonth", schema);
    context.write(new IntWritable(month), new IntWritable(dayOfMonth));
}
The deprecation warnings concern:
HCatRecord
HCatSchema
HCatBaseInputFormat.getTableSchema
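I suspect the warnings come from the old package: when HCatalog was merged into the Hive project, its classes moved from `org.apache.hcatalog.*` to `org.apache.hive.hcatalog.*`, and the old package was deprecated. A sketch of the change, assuming the Hive 0.12+ HCatalog jars are on the classpath (I have not verified this against my exact Hadoop distribution):

```java
// Post-merge package names (the org.apache.hcatalog.* ones are deprecated):
import org.apache.hive.hcatalog.data.HCatRecord;
import org.apache.hive.hcatalog.data.schema.HCatSchema;
import org.apache.hive.hcatalog.mapreduce.HCatBaseInputFormat;

// Inside the mapper, use the overload that takes a Configuration
// instead of the deprecated one that takes the context directly:
// HCatSchema schema =
//     HCatBaseInputFormat.getTableSchema(context.getConfiguration());
```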
Where can I find a similar map-reduce example that uses HCatalog without the deprecated interfaces?

Thanks!