2013-12-10

Hive - partition by year

I am partitioning a table by year in Hive. I created this script:

DROP TABLE movies_byYear; 

CREATE TABLE movies_byYear (title STRING, full_name STRING, ep_name STRING, type STRING, ep_num STRING, suspended BOOLEAN) PARTITIONED BY (year INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE; 

INSERT OVERWRITE TABLE movies_byYear PARTITION (year='2013') SELECT title, full_name, ep_name, type, ep_num, suspended FROM movies WHERE year='2013'; 

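However, when I run: SELECT COUNT(*) FROM movies WHERE year='2013';

I do not get back only the movies from 2013; instead, I get all the movies back.

Is it possible to let Hive decide where to partition?

I would really appreciate your answers!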

UPDATE

When I add year to the SELECT list, I get:

INSERT OVERWRITE TABLE movies_byYear PARTITION (year=2013) SELECT title, full_name, ep_name, type, ep_num, suspended, year FROM movies WHERE year=2013; 

FAILED: SemanticException [Error 10044]: Line 1:23 Cannot insert into target table because column number/types are different '2013': Table insclause-0 has 6 columns, but query has 7 columns. 

Answers

2

In your insert, you select:

SELECT title, full_name, ep_name, type, ep_num, suspended 

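Add year to it... At the moment the year field in movies_byYear is empty...

When you specify year as a partition in the Hive CREATE TABLE statement, year becomes a column of the table!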
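
To illustrate the point about the partition column, here is a minimal sketch that assumes the movies_byYear table from the question exists and that movies.year can be compared with the integer 2013. With a static partition value, the PARTITION clause itself supplies year, so the SELECT list should carry only the six non-partition columns; adding year as a seventh column is what leads to Error 10044.

```sql
-- The partition column is listed together with the regular columns.
DESCRIBE movies_byYear;

-- Static partition: the PARTITION clause supplies year, so the SELECT
-- list carries only the six non-partition columns.
INSERT OVERWRITE TABLE movies_byYear PARTITION (year=2013)
SELECT title, full_name, ep_name, type, ep_num, suspended
FROM movies
WHERE year=2013;

-- year can be filtered like any other column; query the partitioned
-- table to read back just that partition.
SELECT COUNT(*) FROM movies_byYear WHERE year=2013;
```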

UPDATE

Replace this:

INSERT OVERWRITE TABLE movies_byYear PARTITION (year='2013') SELECT title, full_name, ep_name, type, ep_num, suspended FROM movies WHERE year='2013';

with this:

INSERT OVERWRITE TABLE movies_byYear PARTITION (year=2013) SELECT title, full_name, ep_name, type, ep_num, suspended FROM movies WHERE year='2013';

That is, remove the single quotes around the year value in the PARTITION clause...
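
On the question of letting Hive choose the partition: Hive supports dynamic partition inserts, where the partition value is taken from the last column of the SELECT list. The following is only a rough sketch, assuming the movies table has a year column whose values can be converted to INT and that your Hive version and configuration permit dynamic partitioning:

```sql
-- Enable dynamic partitioning for this session.
SET hive.exec.dynamic.partition=true;
-- nonstrict mode allows every partition column to be dynamic.
SET hive.exec.dynamic.partition.mode=nonstrict;

-- No value is given for year in the PARTITION clause; Hive takes it from
-- the last column of the SELECT list and writes one partition per
-- distinct value.
INSERT OVERWRITE TABLE movies_byYear PARTITION (year)
SELECT title, full_name, ep_name, type, ep_num, suspended, year
FROM movies;

-- List the partitions that were created.
SHOW PARTITIONS movies_byYear;
```

Here the seventh column in the SELECT list is expected, because no static value for year is given in the PARTITION clause.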

+0

Thx for your answer! However, it does not work... please see my update! By the way, is it also possible to let Hive choose the partition? – mrquad

+0

You declared year as INT, but then you use year='2013' (a STRING)... Try changing PARTITION (year='2013') to PARTITION (year=2013) – rkh

+0

Thx for your answer! However, I do not understand. In my update, I did not use quotes ('...') – mrquad