2016-10-21

How to get row and field delimiters using the HCatalog Java API

POM dependency:

<dependency>
    <groupId>org.apache.hive.hcatalog</groupId>
    <artifactId>hive-webhcat-java-client</artifactId>
    <version>1.2.1</version>
</dependency>

I am able to get the columns, partition columns, input file format, and so on.

Relevant code:

// connectionUri, THRIFT_CONNECTION_RETRY, TIME_OUT, databaseName, tableName
// and LOG are defined elsewhere in the class.
HiveConf hcatConf = new HiveConf();

// Point the client at the remote metastore and tune the Thrift connection.
hcatConf.setVar(HiveConf.ConfVars.METASTOREURIS, connectionUri);
hcatConf.set("hive.metastore.local", "false");
hcatConf.setIntVar(HiveConf.ConfVars.METASTORETHRIFTCONNECTIONRETRIES, THRIFT_CONNECTION_RETRY);
hcatConf.set(HiveConf.ConfVars.HIVE_SUPPORT_CONCURRENCY.varname, "true");
hcatConf.set(HiveConf.ConfVars.SEMANTIC_ANALYZER_HOOK.varname, HCatSemanticAnalyzer.class.getName());
hcatConf.set(HiveConf.ConfVars.PREEXECHOOKS.varname, "");
hcatConf.set(HiveConf.ConfVars.POSTEXECHOOKS.varname, "");

hcatConf.setTimeVar(HiveConf.ConfVars.METASTORE_CLIENT_SOCKET_TIMEOUT, TIME_OUT, TimeUnit.MILLISECONDS);

HCatClient client = null;
HCatTable hTable = null;

try {
    client = HCatClient.create(hcatConf);
    hTable = client.getTable(databaseName, tableName);
    // Print the storage metadata that is already available.
    System.out.println(hTable.getInputFileFormat());
    System.out.println(hTable.getOutputFileFormat());
    System.out.println(hTable.getSerdeLib());
} catch (HCatException hCatEx) {
    LOG.error("Not able to connect to Hive. Caused by:", hCatEx);
}

How can I get the row and field delimiters of a text table?

According to the Javadoc of getSerdeParams():

public Map<String,String> getSerdeParams()
- Returns parameters such as field delimiter, etc.

But in my case, when I create a table, I get this map:

{serialization.format=1} 

Answers

The map contains only one entry when the table is created like this:

create table tbl1 (c1 int) stored as textfile 

When I run show create table tbl1:

CREATE TABLE `tbl1`(
    `c1` int) 
ROW FORMAT SERDE 
    'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
STORED AS INPUTFORMAT 
    'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
    'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' 
LOCATION 
    'hdfs://localhost:8020/apps/hive/warehouse/dev.db/tbl1' 
TBLPROPERTIES (
    'transient_lastDdlTime'='1477067078') 

No default delimiters are shown.

When I create the table with explicit delimiters:

create table tbl2 (c1 int) ROW FORMAT DELIMITED FIELDS TERMINATED BY "\," LINES TERMINATED BY "\n" stored as textfile; 

When I run show create table tbl2:

CREATE TABLE `tbl2`(
    `c1` int) 
ROW FORMAT DELIMITED 
    FIELDS TERMINATED BY ',' 
    LINES TERMINATED BY '\n' 
STORED AS INPUTFORMAT 
    'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
    'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' 
LOCATION 
    'hdfs://localhost:8020/apps/hive/warehouse/dev.db/tbl2' 
TBLPROPERTIES (
    'transient_lastDdlTime'='1477067160') 

In the second case I specified the delimiters explicitly, so getSerdeParams() returned the required values.
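
To handle both cases in one place, a small helper can read the delimiters from the map that getSerdeParams() returns and fall back to defaults when the keys are absent. The following is only a sketch under some assumptions: the class and method names are hypothetical, the key names "field.delim" and "line.delim" are written as string literals rather than taken from Hive's constants, and the fallback values (Ctrl-A for fields, newline for rows) are Hive's documented defaults for delimited text tables. It deliberately has no Hive dependency, so the maps are built by hand instead of coming from an HCatTable.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the HCatalog API.
class SerdeDelimiters {
    // SerDe parameter keys, written as literals (assumption: these match
    // the keys stored by ROW FORMAT DELIMITED).
    static final String FIELD_DELIM_KEY = "field.delim";
    static final String LINE_DELIM_KEY = "line.delim";

    // Hive's documented defaults for delimited text: Ctrl-A and newline.
    static final String DEFAULT_FIELD_DELIM = "\001";
    static final String DEFAULT_LINE_DELIM = "\n";

    // Returns the explicit field delimiter, or the default if none was stored.
    static String fieldDelimiter(Map<String, String> serdeParams) {
        return serdeParams.getOrDefault(FIELD_DELIM_KEY, DEFAULT_FIELD_DELIM);
    }

    // Returns the explicit line delimiter, or the default if none was stored.
    static String lineDelimiter(Map<String, String> serdeParams) {
        return serdeParams.getOrDefault(LINE_DELIM_KEY, DEFAULT_LINE_DELIM);
    }

    public static void main(String[] args) {
        // Shape of the map seen for tbl1 (no delimiters declared).
        Map<String, String> tbl1Params = new HashMap<>();
        tbl1Params.put("serialization.format", "1");

        // Assumed shape of the map for tbl2 (delimiters declared explicitly).
        Map<String, String> tbl2Params = new HashMap<>();
        tbl2Params.put(FIELD_DELIM_KEY, ",");
        tbl2Params.put(LINE_DELIM_KEY, "\n");

        // Print character codes, since the delimiters may be unprintable.
        System.out.println((int) fieldDelimiter(tbl1Params).charAt(0)); // prints 1
        System.out.println((int) lineDelimiter(tbl1Params).charAt(0));  // prints 10
        System.out.println((int) fieldDelimiter(tbl2Params).charAt(0)); // prints 44
        System.out.println((int) lineDelimiter(tbl2Params).charAt(0));  // prints 10
    }
}
```

With the client code from the question, the call would look like SerdeDelimiters.fieldDelimiter(hTable.getSerdeParams()). Falling back on the client side keeps the lookup working for tables such as tbl1, where the metastore stores nothing beyond serialization.format.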