
How can I fetch data from HBase into a DataFrame (Spark SQL) without using JavaRDD? Code:

import java.util.HashMap;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

SparkConf sconf = new SparkConf().setMaster("local").setAppName("Test");
Configuration conf = HBaseConfiguration.create();
JavaSparkContext jsc = new JavaSparkContext(sconf);

// Check that HBase is reachable before trying to read from it.
try {
    HBaseAdmin.checkHBaseAvailable(conf);
    System.out.println("HBase is running");
} catch (Exception e) {
    System.out.println("HBase is not running");
    e.printStackTrace();
}

SQLContext sqlContext = new SQLContext(jsc);

// Map HBase columns to SQL columns: "<name> <TYPE> <family>:<qualifier>".
// This mapping string is what triggers the exception below.
String sqlMapping = "ROW String :ROW,city STRING r:city";
HashMap<String, String> map = new HashMap<String, String>();
map.put("hbase.columns.mapping", sqlMapping);
map.put("hbase.table", "emp1");

DataFrame dataFrame1 = sqlContext.read().format("org.apache.hadoop.hbase.spark").options(map).load();

Exception:

Exception in thread "main" java.lang.IllegalArgumentException: Invalid value for hbase.columns.mapping 'ROW String :ROW,city STRING r:city'
    at org.apache.hadoop.hbase.spark.DefaultSource.generateSchemaMappingMap(DefaultSource.scala:119)
    at org.apache.hadoop.hbase.spark.DefaultSource.createRelation(DefaultSource.scala:79)
    at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:158)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
    at dataframe.ParquetExample.main(ParquetExample.java:94)
Caused by: java.lang.IllegalArgumentException: Unsupported column type :String

Answer

I have solved the "Unsupported column type :String" exception, but I am now getting another issue:
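For reference, the type name in hbase.columns.mapping appears to be matched case-sensitively, so the likely fix for the first exception is writing the type as STRING rather than String (a minimal sketch; the row-key and column names are taken from the question):

    // Hypothetical corrected mapping: type names written in uppercase.
    String sqlMapping = "ROW STRING :ROW,city STRING r:city";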

Exception in thread "main" java.lang.NullPointerException 
    at org.apache.hadoop.hbase.spark.HBaseRelation.<init>(DefaultSource.scala:175) 
    at org.apache.hadoop.hbase.spark.DefaultSource.createRelation(DefaultSource.scala:78) 
    at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:158) 
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119) 
    at org.apache.spark.sql.SQLContext.load(SQLContext.scala:1140) 
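One possible cause of this NullPointerException, assuming this version of the hbase-spark connector keeps its HBaseContext in a static cache: HBaseRelation looks up the most recently created HBaseContext, and if none was ever constructed, the lookup returns null. A minimal sketch under that assumption, creating a JavaHBaseContext before the load (jsc, conf, sqlContext, and map are from the question's code):

    import org.apache.hadoop.hbase.spark.JavaHBaseContext;

    // Constructing a JavaHBaseContext registers it with the connector,
    // so HBaseRelation can find it when load() runs.
    JavaHBaseContext hbaseContext = new JavaHBaseContext(jsc, conf);

    DataFrame dataFrame1 = sqlContext.read()
        .format("org.apache.hadoop.hbase.spark")
        .options(map)
        .load();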