I am partitioning a table by year in Hive. I created this script (Hive, partition by year):
DROP TABLE movies_byYear;
CREATE TABLE movies_byYear (title STRING, full_name STRING, ep_name STRING, type STRING, ep_num STRING, suspended BOOLEAN) PARTITIONED BY (year INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE;
INSERT OVERWRITE TABLE movies_byYear PARTITION (year='2013') SELECT title, full_name, ep_name, type, ep_num, suspended FROM movies WHERE year='2013';
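However, when I run: SELECT COUNT(*) FROM movies WHERE year='2013';
I do not get only the movies from 2013 back; I get all of the movies instead.
Is it possible to let Hive decide where to partition?
I would really appreciate your answers!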
UPDATE
When I add year to the SELECT list, I get:
INSERT OVERWRITE TABLE movies_byYear PARTITION (year=2013) SELECT title, full_name, ep_name, type, ep_num, suspended, year FROM movies WHERE year=2013;
FAILED: SemanticException [Error 10044]: Line 1:23 Cannot insert into target table because column number/types are different '2013': Table insclause-0 has 6 columns, but query has 7 columns.
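Thanks for your answer! However, it does not work... please see my update! By the way, is it also possible to let Hive choose the partitions? – mrquad
You declared year as INT and then used year='2013' (a STRING)... Try changing PARTITION (year='2013') to PARTITION (year=2013) – rkh
Thanks for your answer! However, I don't understand. In my update I did not use '...' – mrquad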
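On the error in the update: with a static partition such as PARTITION (year=2013), the value of year is fixed by the PARTITION clause, so the SELECT must return only the six non-partition columns. That is what Error 10044 is reporting when the query returns seven. To let Hive choose the partitions itself, dynamic partitioning is the usual route. Below is a minimal sketch, assuming the year column in movies can be converted to INT and that nonstrict mode is acceptable on your cluster; I have not run it against your data.

```sql
-- Assumption: movies.year holds values that Hive can convert to INT.
-- Enable dynamic partitioning for this session.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- No value is given for year in the PARTITION clause, so Hive takes it
-- from the last column of the SELECT and writes one partition per
-- distinct year found in movies.
INSERT OVERWRITE TABLE movies_byYear PARTITION (year)
SELECT title, full_name, ep_name, type, ep_num, suspended, year
FROM movies;

-- Check which partitions were created, then count rows in one of them.
SHOW PARTITIONS movies_byYear;
SELECT COUNT(*) FROM movies_byYear WHERE year = 2013;
```

Note that the dynamic partition column has to come last in the SELECT, and that the final count reads from movies_byYear rather than movies.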