Ignore quotes from a CSV file while loading into a Hive table
I have a CSV file with data in the following format:
"SomeName1",25,"SomeString1"
"SomeName2",26,"SomeString2"
"SomeName3",27,"SomeString3"
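I am loading this CSV into a Hive table. In the table, columns 1 and 3 are inserted together with the quotes, which I do not want. I want column 1 to be SomeName1
and column 3 to be SomeString1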
I tried
WITH SERDEPROPERTIES (
"separatorChar" = "\t",
"quoteChar" = "\""
)
but it did not work, and the quotes were still kept.
What approach should I take here?
Table creation statement:
CREATE TABLE `abcdefgh`(
`name` string COMMENT 'from deserializer',
`age` string COMMENT 'from deserializer',
`value` string COMMENT 'from deserializer')
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
'quoteChar'='\"',
'separatorChar'='\t')
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
'hdfs://a-b-c-d-e:9000/user/hive/warehouse/abcdefgh'
TBLPROPERTIES (
'numFiles'='1',
'numRows'='0',
'rawDataSize'='0',
'totalSize'='3134916',
'transient_lastDdlTime'='1490713221')
Which SerDe are you using? Please post the complete table creation query. – cheseaux
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' – earl
CREATE TABLE `abcdefgh`( `name` string COMMENT 'from deserializer', `age` string COMMENT 'from deserializer', `value` string COMMENT 'from deserializer') ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' WITH SERDEPROPERTIES ( 'quoteChar'='\"', 'separatorChar'='\t') STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 'hdfs://a-b-c-d-e:9000/user/hive/warehouse/abcdefgh' TBLPROPERTIES ( 'numFiles'='1', 'numRows'='0', 'rawDataSize'='0', ...) – earl
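One thing that stands out in the statement above is that the sample rows are separated by commas, while `separatorChar` is declared as a tab. A possible fix, sketched under the assumption that the file really is comma-delimited, is to declare the comma as the separator so that OpenCSVSerde can split the fields and strip the surrounding quotes. The table name `abcdefgh_csv` is hypothetical and only used for illustration; this has not been run against the original data.

```sql
-- Sketch only: assumes the input file is comma-delimited, as the sample rows suggest.
-- `abcdefgh_csv` is a hypothetical table name, not taken from the question.
CREATE TABLE `abcdefgh_csv` (
  `name` string,
  `age` string,
  `value` string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
  'separatorChar' = ',',   -- field delimiter used in the sample rows
  'quoteChar' = '"',       -- quotes around fields are removed on read
  'escapeChar' = '\\')     -- backslash escapes a quote inside a field
STORED AS TEXTFILE;
```

Note that OpenCSVSerde reads every column as string, so a numeric column such as `age` would need a cast in queries or a view on top of the table.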