1
我在hbase中有一個名爲「sample」的表。我需要使用Apache spark-sql查詢來查詢表。 有什麼方法可以使用Apache spark-sql查詢來讀取hbase數據嗎?使用spark-sql的Hbase
我在hbase中有一個名爲「sample」的表。我需要使用Apache spark-sql查詢來查詢表。 有什麼方法可以使用Apache spark-sql查詢來讀取hbase數據嗎?使用spark-sql的Hbase
星火SQL是在內存中的查詢引擎,利用星火SQL對HBase的桌子上執行一些查詢操作,你需要
使用星火從HBase的獲取數據,並創建星火RDD
SparkConf sparkConf = new SparkConf();
sparkConf.setAppName("SparkApp");
sparkConf.setMaster("local[*]");
JavaSparkContext javaSparkContext = new JavaSparkContext(sparkConf);
Configuration config = HBaseConfiguration.create();
config.addResource(new Path("/etc/hbase/hbase-site.xml"));
config.addResource(new Path("/etc/hadoop/core-site.xml"));
config.set(TableInputFormat.INPUT_TABLE, "sample");
JavaPairRDD<ImmutableBytesWritable, Result> hbaseRDD = javaSparkContext.newAPIHadoopRDD(hbaseConfig, TableInputFormat.class, ImmutableBytesWritable.class, Result.class);
JavaRDD<StudentBean> sampleRDD = hbaseRDD.map(new Function<Tuple2<ImmutableBytesWritable,Result>, StudentBean >() {
private static final long serialVersionUID = -2021713021648730786L;
public StudentBean call(Tuple2<ImmutableBytesWritable, Result> tuple) {
StudentBean bean = new StudentBean ();
Result result = tuple._2;
bean.setRowKey(rowKey);
bean.setFirstName(Bytes.toString(result.getValue(Bytes.toBytes("details"), Bytes.toBytes("firstName"))));
bean.setLastName(Bytes.toString(result.getValue(Bytes.toBytes("details"), Bytes.toBytes("lastName"))));
bean.setBranch(Bytes.toString(result.getValue(Bytes.toBytes("details"), Bytes.toBytes("branch"))));
bean.setEmailId(Bytes.toString(result.getValue(Bytes.toBytes("details"), Bytes.toBytes("emailId"))));
return bean;
}
});
通過使用此RDD創建數據框對象,並使用一些臨時表的名字註冊這一點,那麼你就可以執行查詢
DataFrame schema = sqlContext.createDataFrame(sampleRDD, StudentBean.class);
schema.registerTempTable("spark_sql_temp_table");
DataFrame schemaRDD = sqlContext.sql("YOUR_QUERY_GOES_HERE");
JavaRDD<StudentBean> result = schemaRDD.toJavaRDD().map(new Function<Row, StudentBean>() {
private static final long serialVersionUID = -2558736294883522519L;
public StudentBean call(Row row) throws Exception {
StudentBean bean = new StudentBean();
// Do the mapping stuff here
return bean;
}
});