Hive Sort Merge Bucket Map (SMB Map) join
I ran the same query with and without the SMB join and got different results. Please help me understand why.
SET hive.enforce.bucketing=true;
create table dbaproceduresbuckets (
owner string ,
object_name string ,
procedure_name string ,
object_id double ,
subprogram_id double ,
overload string ,
object_type string ,
aggregate string ,
pipelined string ,
impltypeowner string ,
impltypename string ,
parallel string ,
interface string ,
deterministic string ,
authid string)
CLUSTERED BY (object_id) SORTED BY (OBJECT_ID ASC) INTO 32 BUCKETS;
CREATE TABLE dbaobjectsbuckets1(
owner string,
object_name string,
subobject_name string,
object_id double,
data_object_id double,
object_type string,
created string,
last_ddl_time string,
timestamp string,
status string,
temporary string,
generated string,
secondary string,
namespace double,
edition_name string) CLUSTERED BY (object_id) SORTED BY (OBJECT_ID ASC) INTO 32 BUCKETS;
**** load the table;
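A quick way to inspect how the loaded rows ended up in the bucket files is to group by Hive's INPUT__FILE__NAME virtual column. This is only a sketch using the table names above; with INTO 32 BUCKETS one would expect 32 distinct file names per table if the load honored the bucketing.

```sql
-- Rows per underlying data file for each table.
-- INPUT__FILE__NAME is a Hive virtual column holding the file a row was read from.
select INPUT__FILE__NAME, count(*)
from dbaproceduresbuckets
group by INPUT__FILE__NAME;

select INPUT__FILE__NAME, count(*)
from dbaobjectsbuckets1
group by INPUT__FILE__NAME;
```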
0: jdbc:hive2://xxxxxx:10000> select count(*) from dbaobjectsbuckets1 a, dbaproceduresbuckets b
0: jdbc:hive2://xxxxxxxx:10000> where a.object_id = b.object_id;
INFO : Hadoop job information for Stage-2: number of mappers: 3; number of reducers: 1
INFO : 2016-06-13 15:56:00,381 Stage-2 map = 0%, reduce = 0%
INFO : 2016-06-13 15:56:55,818 Stage-2 map = 1%, reduce = 0%, Cumulative CPU 122.6 sec
INFO : 2016-06-13 15:57:47,124 Stage-2 map = 7%, reduce = 0%, Cumulative CPU 326.86 sec
.........
INFO : 2016-06-13 16:05:01,246 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 867.1 sec
INFO : MapReduce Total cumulative CPU time: 14 minutes 27 seconds 100 msec
INFO : Ended Job = job_1464280256859_0146
+--------+--+
|  _c0   |
+--------+--+
| 54876  |
+--------+--+
****
set hive.auto.convert.sortmerge.join=true;
set hive.optimize.bucketmapjoin=true;
set hive.optimize.bucketmapjoin.sortedmerge=true;
set hive.auto.convert.sortmerge.join.noconditionaltask=true;
set hive.enforce.bucketing=true;
set hive.enforce.sorting=true;
0: jdbc:hive2://xxxxxxx:10000> select count(*) from dbaobjectsbuckets1 a, dbaproceduresbuckets b
0: jdbc:hive2://xxxxxxxx:10000> where a.object_id = b.object_id;
In the execution plan, I am seeing
| Sorted Merge Bucket Map Join Operator |
|   condition map: |
|     Inner Join 0 to 1 |
|   keys: |
|     0 object_id (type: double) |
|     1 object_id (type: double) |
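For reference, a plan like this can be obtained by prefixing the query with EXPLAIN. This is a minimal sketch that assumes the same session settings and table names as above; the exact plan text varies by Hive version.

```sql
-- Show the execution plan without running the join.
-- With the SMB settings above in effect, the plan is expected to list
-- a "Sorted Merge Bucket Map Join Operator" rather than a common join.
EXPLAIN
select count(*) from dbaobjectsbuckets1 a, dbaproceduresbuckets b
where a.object_id = b.object_id;
```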
**** but the result is showing
INFO : Hadoop job information for Stage-1: number of mappers: 32; number of reducers: 1
......
INFO : MapReduce Total cumulative CPU time: 4 minutes 8 seconds 490 msec
INFO : Ended Job = job_1464280256859_0150
+------+--+
| _c0  |
+------+--+
| 2    |
+------+--+
My question is: why do I get only 2 when I use the SMB join? It should be 54876.
Thanks!