How can I fetch data from HBase with a DataFrame (Spark SQL) without using JavaRDD? Code:
SparkConf sconf = new SparkConf().setMaster("local").setAppName("Test");
Configuration conf = HBaseConfiguration.create();
JavaSparkContext jsc = new JavaSparkContext(sconf);
try {
    HBaseAdmin.checkHBaseAvailable(conf);
    System.out.println("HBase is running");
} catch (Exception e) {
    System.out.println("HBase is not running");
    e.printStackTrace();
}
SQLContext sqlContext = new SQLContext(jsc);
String sqlMapping ="ROW String :ROW,city STRING r:city";
HashMap<String, String> map = new HashMap<String, String>();
map.put("hbase.columns.mapping", sqlMapping);
map.put("hbase.table", "emp1");
DataFrame dataFrame1 = sqlContext.read().format("org.apache.hadoop.hbase.spark").options(map).load();
Exception:

Exception in thread "main" java.lang.IllegalArgumentException: Invalid value for hbase.columns.mapping 'ROW String :ROW,city STRING r:city'
        at org.apache.hadoop.hbase.spark.DefaultSource.generateSchemaMappingMap(DefaultSource.scala:119)
        at org.apache.hadoop.hbase.spark.DefaultSource.createRelation(DefaultSource.scala:79)
        at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:158)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
        at dataframe.ParquetExample.main(ParquetExample.java:94)
Caused by: java.lang.IllegalArgumentException: Unsupported column type :String
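The "Unsupported column type :String" cause suggests the mapping string itself is the problem: in the hbase-spark DefaultSource, type names are expected in uppercase (STRING, not String), and the row key is conventionally mapped to the special ":key" qualifier rather than ":ROW". A minimal sketch of the corrected options, reusing the table name "emp1" and column family "r" from the question (the class and method names here are illustrative, not part of any API):

```java
import java.util.HashMap;
import java.util.Map;

public class HBaseMappingFix {
    // Builds the connector options with a corrected mapping string:
    // type names uppercase ("STRING" instead of "String"), and the
    // row-key column mapped via the special ":key" qualifier.
    static Map<String, String> buildOptions() {
        String sqlMapping = "ROW STRING :key, city STRING r:city";
        Map<String, String> options = new HashMap<String, String>();
        options.put("hbase.columns.mapping", sqlMapping);
        options.put("hbase.table", "emp1");
        return options;
    }

    public static void main(String[] args) {
        // Print the corrected mapping for inspection.
        System.out.println(buildOptions().get("hbase.columns.mapping"));
    }
}
```

Passing this map to sqlContext.read().format("org.apache.hadoop.hbase.spark").options(map).load() in place of the original options should get past the IllegalArgumentException, assuming the table and column family exist in HBase.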