2015-04-23

I am using Spark Streaming to download web pages and insert them into HBase. I keep hitting the following exception: HBase KeyValue size too large

WARN scheduler.TaskSetManager: Lost task 13.1 in stage 21.0 (TID 121,test1.server): java.lang.IllegalArgumentException: KeyValue size too large 
    at org.apache.hadoop.hbase.client.HTable.validatePut(HTable.java:1378) 
    at org.apache.hadoop.hbase.client.HTable.validatePut(HTable.java:1364) 
    at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:974) 
    at org.apache.hadoop.hbase.client.HTable.put(HTable.java:941) 
    at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:126) 
    at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:87) 
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$12.apply(PairRDDFunctions.scala:1000) 
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$12.apply(PairRDDFunctions.scala:979) 
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61) 
    at org.apache.spark.scheduler.Task.run(Task.scala:64) 
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
    at java.lang.Thread.run(Thread.java:745) 

I tried increasing hbase.client.keyvalue.maxsize, and also setting hbase.client.keyvalue.maxsize = 0 to remove the limit entirely. In addition, I raised hdfs.blocksize to 256M. But after restarting the cluster I still get the same error: KeyValue size too large. Any ideas? Thanks in advance!

Answer


hbase.client.keyvalue.maxsize is a client-side property. You need to set it in hbase-site.xml on the client node, or set it on the Configuration object in your code. No HBase restart is needed for this property.
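For example, the hbase-site.xml entry on the client node could look like the following sketch. The 50 MB value here is only an illustration, not a recommendation; pick a limit that fits your largest cell, or 0 to disable the check:

```xml
<!-- hbase-site.xml on the client (Spark driver/executor) node.
     The HBase default for this property is 10485760 (10 MB).
     52428800 (50 MB) below is an example value only. -->
<property>
  <name>hbase.client.keyvalue.maxsize</name>
  <value>52428800</value>
</property>
```

Equivalently, set it programmatically on the HBase `Configuration` object your Spark job passes to `TableOutputFormat`, e.g. `conf.set("hbase.client.keyvalue.maxsize", "52428800")`, before the job is submitted.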
