2016-03-31

Hive query: CSV text delimiter issue

I am trying to import the following data into Hive.

Name,Phone,Address

Arverne,(718) 634-4784,"*312 Beach 54 Street 
Arverne, NY 11692 
(40.59428994144626, -73.78442865540268)*" 

Astoria,(718) 278-2220,"*14 01 Astoria Boulevard 
Long Island City, NY 11102 
(40.77152402451418, -73.92643545073543)*" 

Auburndale,(718) 352-2027,"*25 55 Francis Lewis Boulevard 
Flushing, NY 11358 
(40.76035096822195, -73.79632645819947)*" 

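However, the address does not load correctly, which corrupts the table data. I think the problem is the line terminator (Hive uses the default \n, while each address spans 3-4 lines), because when I run the sample data below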

a,b,"e,f" 

x,y,"l,m" 

with the query below

create table test(c1 string, c2 string, c3 string) 
row format serde 'com.bizo.hive.serde.csv.CSVSerde' 
with serdeproperties(
"separatorChar" = ","); 

it works fine:

test.c1 test.c2 test.c3

a b c,d 

e f g,z 

How can I do this?
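The line-terminator suspicion can be checked outside Hive. The sketch below is only an illustration: it uses two records from the question (with the asterisks and trailing spaces removed, which is my simplification) and compares a plain line split with Python's quote-aware csv module. Hive's text input format also cuts records at every newline before the SerDe parses the fields, so the first count is roughly what Hive sees.

```python
import csv
import io

# Two records from the question, simplified; each quoted address spans three physical lines.
SAMPLE = (
    'Arverne,(718) 634-4784,"312 Beach 54 Street\n'
    'Arverne, NY 11692\n'
    '(40.59428994144626, -73.78442865540268)"\n'
    'Astoria,(718) 278-2220,"14 01 Astoria Boulevard\n'
    'Long Island City, NY 11102\n'
    '(40.77152402451418, -73.92643545073543)"\n'
)

# Line-oriented view: every newline ends a record, whether it is quoted or not.
physical_lines = SAMPLE.splitlines()

# CSV-aware view: newlines inside double quotes stay within the field.
logical_records = list(csv.reader(io.StringIO(SAMPLE, newline="")))

print(len(physical_lines))   # number of line-based fragments
print(len(logical_records))  # number of real CSV records
```

If the two counts differ, the file has embedded newlines, and changing only separatorChar or quoteChar is unlikely to fix the load.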

Answers


This is what I worked out.

>>> CREATE TABLE hiveTest(name string, phone string, address string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;
>>> load data inpath 'file.csv' into table hiveTest;

>>> select name from hiveTest; 
+-------------+--+
|    name     |
+-------------+--+
| Arverne     |
| Astoria     |
| Auburndale  |
+-------------+--+
>>> select address from hiveTest; 
+--------------------------------------------+--+
|                  address                   |
+--------------------------------------------+--+
| "312 Beach 54 Street Arverne               |
| "14 01 Astoria Boulevard Long Island City  |
| "25 55 Francis Lewis Boulevard Flushing    |
+--------------------------------------------+--+

I hope this helps.
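One possible workaround, which is my own assumption and not something confirmed in this thread, is to flatten the multi-line addresses before loading so that every record sits on a single physical line. A minimal sketch using Python's csv module; the helper name flatten_multiline_fields is hypothetical and the sample row is taken from the question without the asterisks:

```python
import csv
import io


def flatten_multiline_fields(src, dst):
    """Copy CSV rows from src to dst, collapsing whitespace runs
    (including embedded line breaks) inside each field to one space."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    for row in reader:
        writer.writerow([" ".join(field.split()) for field in row])


# Header plus one record from the question; the address spans three lines.
raw = (
    'Name,Phone,Address\n'
    'Arverne,(718) 634-4784,"312 Beach 54 Street \n'
    'Arverne, NY 11692 \n'
    '(40.59428994144626, -73.78442865540268)"\n'
)

out = io.StringIO()
flatten_multiline_fields(io.StringIO(raw, newline=""), out)
flattened = out.getvalue()
print(flattened)
```

For a real file, the same function can be called with two file objects opened with newline="". The flattened address still contains commas inside quotes, so FIELDS TERMINATED BY ',' alone would still split it; a quote-aware SerDe such as the CSVSerde from the question would probably still be needed.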


The address is truncated. It is supposed to be "312 Beach 54 Street Arverne, NY 11692 (40.59428994144626, -73.78442865540268)" – sr7


Try this: create table my_table(name string, phone string, address string) row format serde 'com.bizo.hive.serde.csv.CSVSerde' with serdeproperties ("separatorChar" = "\t", "quoteChar" = "'", "escapeChar" = "\\") stored as textfile. Change the serdeproperties as required. – srikanth


I already tried these options ("separatorChar" = ",", "quoteChar" = "\"", "escapeChar" = "\n") ... it still does not work. You can get the actual data from this link: https://nycopendata.socrata.com/Recreation/Queens-Library-Branches/kh3d-xhq7? – sr7