Bulk inserting data into HBase using MapReduce

I need to insert 400 million rows into an HBase table.

The schema looks like this

Here I generate the row key by simply concatenating two ints, and the value is System.nanoTime().
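For reference, a minimal plain-Java sketch of that key scheme, assuming the two ints are the loop counters `i` and `j` from the mapper posted in this question. The class and method names (`RowKeySketch`, `rowKey`, `encodeValue`) are hypothetical and only serve the illustration; `encodeValue` mimics the 8-byte big-endian layout of HBase's `Bytes.toBytes(long)` so the sketch runs without HBase on the classpath.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not from the original post: builds the row key the
// question describes by concatenating the decimal forms of two ints.
final class RowKeySketch {

    // Concatenate the two counters as decimal strings, e.g. (12, 12) -> "1212".
    static String rowKey(int i, int j) {
        return Integer.toString(i).concat(Integer.toString(j));
    }

    // Encode a long as 8 big-endian bytes, the same layout that
    // Bytes.toBytes(long) produces in HBase.
    static byte[] encodeValue(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    public static void main(String[] args) {
        // Both counters advance together, as in the question's loop.
        for (int i = 0, j = 0; i < 3; i++, j++) {
            long value = Math.abs(System.nanoTime());
            System.out.println(rowKey(i, j) + " -> " + value
                    + " (" + encodeValue(value).length + " bytes)");
        }
    }
}
```

Plain concatenation is not unique in general: (1, 23) and (12, 3) both give "123". In the posted loop the two counters are always equal, so that collision does not arise there.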

My mapper looks like this

public class DatasetMapper extends TableMapper<Text, LongWritable> {

    private static Configuration conf = HBaseConfiguration.create();

    public void map(Text key, LongWritable values, Context context) throws Exception {

        // instantiate HTable object that connects to table name
        HTable htable = new HTable(conf, "temp"); // already created temp table
        htable.setAutoFlush(false);
        htable.setWriteBufferSize(1024 * 1024 * 12);

        // construct key
        int i = 0, j = 0;
        for (i = 0; i < 400000000; i++) {
            String rowkey = Integer.toString(i).concat(Integer.toString(j));
            Long value = Math.abs(System.nanoTime());
            Put put = new Put(Bytes.toBytes(rowkey));
            put.add(Bytes.toBytes("location"), Bytes.toBytes("longlat"), Bytes.toBytes(value));
            htable.put(put);
            j++;
            htable.flushCommits();
        }
    }
}

And my job looks like this

Configuration config = HBaseConfiguration.create();
Job job = new Job(config, "initdb");
job.setJarByClass(DatasetMapper.class); // class that contains mapper

TableMapReduceUtil.initTableMapperJob(
    null,                 // input table
    null,                 // Scan instance
    DatabaseMapper.class, // mapper class
    null,                 // mapper output key
    null,                 // mapper output value
    job);
TableMapReduceUtil.initTableReducerJob(
    "temp",               // output table
    null,                 // reducer class
    job);
job.setNumReduceTasks(0);

boolean b = job.waitForCompletion(true);
if (!b) {
    throw new IOException("error with job!");
}

The job runs, but 0 records are inserted. I know I have made a mistake somewhere, but I can't spot it because I am new to HBase. Please help me.

Thanks


2013-07-26 Shashank.Kr

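First things first: your mapper is named DatasetMapper, but in your job configuration you have specified DatabaseMapper. I wonder how it runs without any error.

Next, it looks like you have mixed up TableMapper and Mapper usage. HBase's TableMapper is an abstract class that extends Hadoop's Mapper and helps us read from HBase conveniently, while TableReducer helps us write back to HBase. You are trying to put data from within your Mapper, and you are using TableReducer at the same time. Your mapper will never actually get called.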

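Either use TableReducer to put the data, or use just a Mapper. If you really want to do it in your Mapper, you can make use of the TableOutputFormat class. See the example given on page 301 of HBase: The Definitive Guide. This is the Google Books link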
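A rough sketch of the Mapper-only route, not the book's example. It assumes HBase 1.0 or later (where `Put.addColumn` exists; the 0.94-era API used in the question calls it `put.add`) and Hadoop 2's `mapreduce` API. The names `RangeLoadJob` and `RangeMapper`, the "start,end" input-line format, and the choice of `NLineInputFormat` are illustrative assumptions: a plain Mapper needs some input to drive it, and here each input line hands one range of row indices to one map task.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.NLineInputFormat;

// Sketch only: a map-only job whose plain Mapper emits Puts to TableOutputFormat.
class RangeLoadJob {

    // Assumption: each input line is "start,end" (end exclusive).
    public static class RangeMapper
            extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {

        private static final byte[] FAMILY = Bytes.toBytes("location");
        private static final byte[] QUALIFIER = Bytes.toBytes("longlat");

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            String[] range = line.toString().trim().split(",");
            long start = Long.parseLong(range[0]);
            long end = Long.parseLong(range[1]);
            for (long i = start; i < end; i++) {
                byte[] row = Bytes.toBytes(rowKey(i));
                Put put = new Put(row);
                put.addColumn(FAMILY, QUALIFIER,
                        Bytes.toBytes(Math.abs(System.nanoTime())));
                // No HTable in the mapper: the output format writes the Put.
                context.write(new ImmutableBytesWritable(row), put);
            }
        }
    }

    // Same key scheme as the question, where both counters are always equal.
    static String rowKey(long i) {
        String s = Long.toString(i);
        return s.concat(s);
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = Job.getInstance(conf, "initdb");
        job.setJarByClass(RangeLoadJob.class);

        job.setMapperClass(RangeMapper.class);
        job.setInputFormatClass(NLineInputFormat.class);
        NLineInputFormat.addInputPath(job, new Path(args[0])); // file of ranges
        NLineInputFormat.setNumLinesPerSplit(job, 1);          // one range per map task

        // A null reducer class still configures TableOutputFormat for "temp".
        TableMapReduceUtil.initTableReducerJob("temp", null, job);
        job.setNumReduceTasks(0);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

With an input file of four lines such as `0,100000000` through `300000000,400000000`, this would run four map tasks covering the 400 million rows. The ranges and the task count are only an example and would need tuning for a real cluster.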

HTH

P.S.: You might find these links helpful for learning HBase + MR integration properly:

Link 1.

Link 2.


2013-07-27 00:12:29 Tariq

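Hi Tariq, thanks for pointing that out and for the useful links. All the examples I had seen use HBase as both the source and the sink, which is what confused me. Let me try it my own way.. I will post my results with the correct code. –

You're welcome, @Shashank.Kr. Sure. – Tariq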
