2017-01-10 84 views

How do I create a table on data that looks like the following?

name:Jack Reacher||Age:30||Place:Ohio||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg 
name:Jack Reacher||Age:30||Place:Ohio||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg 
name:Jack Reacher||Age:30||Place:Ohio||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg 

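I created a table based on this data, but the mapping only holds while every line has the same keys. It breaks as soon as a line starts with a new key, as in the second and fifth lines below.

New lines: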

name:Jack Reacher||Age:30||Place:Ohio||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg 
SIA:uewi||Age:30||Place:Ohio||Qtype:Jame/tyler/on.txt/||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg 
name:Jack Reacher||Age:30||Place:Ohio||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg 
name:Jack Reacher||Age:30||Place:Ohio||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg 
SIA:uewi||Age:30||Place:Ohio||Qtype:Jame/tyler/on.txt/||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg 
name:Jack Reacher||Age:30||Place:Ohio||ID:43730||inorg:abcd office||file:qwertyu/werty/ghj/dfhj.jpg 

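How do I create the table and its schema? I tried mapping the fields as strings in the table definition, but that did not work.

Can you tell me which delimiter to use to create the table and get the key-value data?

I have tried: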

Create table dataset (
    name string, 
    SIA string, 
    Age string, 
    Place string, 
    Qtype string, 
    ID string, 
    inorg string, 
    file string 
) ROW SEPERATED BY '||' stored as textfile; 

Answer

You have to write a custom SerDe, because your format does not fall into any of the following built-in categories:

Avro (Hive 0.9.1 and later) 
ORC (Hive 0.11 and later) 
RegEx 
Thrift 
Parquet (Hive 0.13 and later) 
CSV (Hive 0.14 and later) 
JsonSerDe (Hive 0.12 and later in hcatalog-core) 

Alternatively, modify the data file: replace || with , and turn each line into JSON, then use JsonSerDe. Or try the RegEx SerDe.
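As a rough illustration of the "make it JSON" route, here is a minimal Python sketch of a pre-processing step. It assumes each record is a single line of key:value pairs joined by ||, and that the column list is the one from the CREATE TABLE attempt in the question. The function names and the convert helper are hypothetical, not part of Hive. Note that a plain text replacement of || with , would not produce valid JSON on its own, since keys and values also need quoting, so the sketch builds each object explicitly.

```python
import json

# Column names taken from the CREATE TABLE attempt in the question (assumption:
# these are all the keys that can occur in the data).
COLUMNS = ["name", "SIA", "Age", "Place", "Qtype", "ID", "inorg", "file"]


def parse_line(line):
    """Split one 'key:value||key:value' record into a dict of strings."""
    record = {}
    for field in line.strip().split("||"):
        if not field:
            continue
        # Split on the first ':' only, so a value may itself contain ':'.
        key, _, value = field.partition(":")
        record[key] = value
    return record


def to_json_line(line, columns=COLUMNS):
    """Return a one-line JSON object; keys missing from the record become null."""
    record = parse_line(line)
    return json.dumps({col: record.get(col) for col in columns})


def convert(src_path, dst_path):
    """Hypothetical helper: rewrite a data file as one JSON object per line."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if line.strip():
                dst.write(to_json_line(line) + "\n")
```

Because every output line carries the same set of keys, a line that starts with SIA instead of name simply has name set to null. The resulting file could then be loaded into a table that uses JsonSerDe. How that SerDe handles key case and nulls should be checked against your Hive version.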

Hey, I have also tried create external table (name map<string,string> delimited string, SIA map<string,string> delimited string, – RHarsha

Whether you use an external table or a managed table, you need a SerDe, because your data is not in any standard format. –

Is there no workable solution? – RHarsha