2017-03-28 61 views
0

Ignore quotes from a CSV file while loading into a Hive table

I have a CSV file with data in the following format:

"SomeName1",25,"SomeString1" 
"SomeName2",26,"SomeString2" 
"SomeName3",27,"SomeString3" 

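I am loading this CSV into a Hive table. In the table, columns 1 and 3 are inserted together with the quotes, which I do not want. I want column 1 to be SomeName1 and column 3 to be SomeString1.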

I tried the following SerDe properties:

WITH SERDEPROPERTIES (
    "separatorChar" = "\t", 
    "quoteChar"  = "\"" 
) 

However, this does not work, and the quotes are still kept.

What approach should I take here?

Table creation statement:

CREATE TABLE `abcdefgh`(
    `name` string COMMENT 'from deserializer', 
    `age` string COMMENT 'from deserializer', 
    `value` string COMMENT 'from deserializer') 
ROW FORMAT SERDE 
    'org.apache.hadoop.hive.serde2.OpenCSVSerde' 
WITH SERDEPROPERTIES (
    'quoteChar'='\"', 
    'separatorChar'='\t') 
STORED AS INPUTFORMAT 
    'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
    'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' 
LOCATION 
    'hdfs://a-b-c-d-e:9000/user/hive/warehouse/abcdefgh' 
TBLPROPERTIES (
    'numFiles'='1', 
    'numRows'='0', 
    'rawDataSize'='0', 
    'totalSize'='3134916', 
    'transient_lastDdlTime'='1490713221') 
+0

Which SerDe are you using? Please post the complete table creation query. – cheseaux

+0

ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' – earl

+0

CREATE TABLE `abcdefgh`( `name` string COMMENT 'from deserializer', `age` string COMMENT 'from deserializer', `value` string COMMENT 'from deserializer') ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' WITH SERDEPROPERTIES ( 'quoteChar'='\"', 'separatorChar'='\t') STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 'hdfs://a-b-c-d-e:9000/user/hive/warehouse/abcdefgh' TBLPROPERTIES ( 'numFiles'='1', 'numRows'='0', 'rawDataSize'='0', ,) – earl

Answers

2

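Your file is comma-separated, but the table declares a tab as the separator, so the SerDe never splits the fields and the quotes are not stripped. The separator should be a comma: "separatorChar" = ','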

create external table mytable 
(
    col1 string 
    ,col2 int 
    ,col3 string 
) 
row format serde 'org.apache.hadoop.hive.serde2.OpenCSVSerde' 
with serdeproperties 
(
    "separatorChar" = ',' 
    ,"quoteChar"  = '"' 
) 
stored as textfile 
; 

select * from mytable 
; 

+--------------+--------------+--------------+
| mytable.col1 | mytable.col2 | mytable.col3 |
+--------------+--------------+--------------+
| SomeName1    | 25           | SomeString1  |
| SomeName2    | 26           | SomeString2  |
| SomeName3    | 27           | SomeString3  |
+--------------+--------------+--------------+
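
If recreating the table is not convenient, the same fix can probably be applied to the table from the question in place. The following is a minimal sketch, assuming the existing table `abcdefgh` already uses `OpenCSVSerde` and that its data files are comma-separated as shown in the question. The `CAST` is included because `OpenCSVSerde` reads every column as a string, regardless of the declared column type.

```sql
-- Sketch only: change the separator on the existing table instead of recreating it.
-- Assumes the table from the question (abcdefgh) and comma-separated data files.
ALTER TABLE abcdefgh SET SERDEPROPERTIES (
    'separatorChar' = ',',
    'quoteChar'     = '"'
);

-- Check the result. OpenCSVSerde returns all columns as strings,
-- so cast age explicitly if a numeric value is needed.
SELECT name,
       CAST(age AS INT) AS age,
       value
FROM abcdefgh
LIMIT 3;
```

Because the SerDe parses the files at read time, changing the properties should take effect on the next query without reloading the data.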
+0

Works as expected. Thanks!! – earl