Putting German text into an HBase table

I am trying to update a table by adding a German string, as follows:

put 'table:data_validation_test','58e1f4200f23e474ca2d7f3a','urlbody:data','Auslöser'

When I scan the table, I get this:

scan 'table:data_validation_test' 
ROW         COLUMN+CELL                        
58e1f4200f23e474ca2d7f3a   column=urlbody:data, timestamp=1491215905923, value=Ausl\xC3\xB6ser          
58e1f4200f23e474ca2d7f3a   column=urlbody:id, timestamp=1491215697534, value=58e1f4200f23e474ca2d7f3a

I cannot find a way to set the string encoding in HBase. How do I get the string into HBase correctly?

Source

2017-04-03 Ravi Ranjan

This is only a display issue in the output of the scan command (the same applies to get). Your string is in fact stored correctly.

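This is because ö is encoded on 2 bytes in UTF-8 (\xC3\xB6), and neither \xC3 nor \xB6 can be displayed as a readable character on its own, so the shell prints them as escapes. Remember that in HBase the main type is Array[Byte]: values are stored as raw bytes, not as strings with an attached encoding.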
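To make the byte-level picture concrete, here is a minimal sketch in plain Ruby that does not touch HBase at all. It assumes the value was written as UTF-8, and the escaping rule (printable ASCII kept, everything else shown as \xNN) is only an approximation of what the shell prints, not its actual implementation. The variable names are illustrative.

```ruby
# "Auslöser", written with a Unicode escape so the file's own encoding does not matter
text = "Ausl\u00F6ser"

# The raw bytes that would be stored: ö becomes the two bytes 0xC3 0xB6
bytes = text.encode('UTF-8').bytes

# Approximate the shell display: keep printable ASCII, escape every other byte as \xNN
escaped = bytes.map { |b| (32..126).cover?(b) ? b.chr : format('\\x%02X', b) }.join

# Interpreting the same bytes as UTF-8 gives back the original string
decoded = bytes.pack('C*').force_encoding('UTF-8')

puts escaped   # Ausl\xC3\xB6ser
puts decoded   # Auslöser
```

The escaped form and the decoded form come from the same nine bytes; only the way they are rendered differs.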

If you fetch your string value using JRuby (in the HBase shell):

include Java
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.HTable
import org.apache.hadoop.hbase.client.Get
import org.apache.hadoop.hbase.util.Bytes

# Load the cluster configuration and open the table
config = HBaseConfiguration.create
htable = HTable.new(config, 'table:data_validation_test')

# Fetch the row by its key (row keys are byte arrays too)
result = htable.get(Get.new('58e1f4200f23e474ca2d7f3a'.to_java_bytes))

# Decode the stored bytes as a UTF-8 string before printing
puts Bytes.toString(result.getValue('urlbody'.to_java_bytes, 'data'.to_java_bytes))

Then your value should be displayed correctly.

Source

2017-04-03 14:28:53 norbjd
