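I thought this might belong to an older discussion, but rather than stretch that into a forum-style conversation, I think it is fairer to the commenters who replied there if I open it as a separate question.

Teradata SQL: PDCR tables join: can someone explain the row count gap?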

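I want to understand why these two queries give slightly different results and, importantly, why one of them misses an important candidate user. It is a simple report that pulls the high-CPU users by database.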


SELECT b.objectdatabasename,
    a.username,
    CAST(SUM(a.AmpCPUTime(DEC(18, 3)) + ZEROIFNULL(a.ParserCPUTime)) AS DECIMAL(18, 3))
FROM pdcrinfo.dbqlogtbl a
INNER JOIN (
      SELECT queryid, logdate,
       MIN(objectdatabasename) AS objectdatabasename
      FROM pdcrinfo.dbqlobjtbl_hst
      -- the database filter is applied here, before the MIN
      WHERE objectdatabasename IN (
        SELECT child
        FROM dbc.children
        WHERE parent = 'findb'
        GROUP BY 1
      )
      GROUP BY 1, 2
     ) b ON
      a.queryid = b.queryid
      AND a.logdate = b.logdate
      AND a.logdate BETWEEN x AND y   -- x, y: placeholder date bounds
      AND b.logdate BETWEEN x AND y
GROUP BY 1, 2



SELECT b.objectdatabasename,
    a.username,
    CAST(SUM(a.AmpCPUTime(DEC(18, 3)) + ZEROIFNULL(a.ParserCPUTime)) AS DECIMAL(18, 3))
FROM pdcrinfo.dbqlogtbl a
INNER JOIN (
      SELECT queryid, logdate,
       MIN(objectdatabasename) AS objectdatabasename
      FROM pdcrinfo.dbqlobjtbl_hst
      GROUP BY 1, 2
     ) b ON
     a.queryid = b.queryid
     AND a.logdate = b.logdate
     AND a.logdate BETWEEN x AND y   -- x, y: placeholder date bounds
     AND b.logdate BETWEEN x AND y
-- the database filter is applied here, after the MIN
WHERE b.objectdatabasename IN (
      SELECT child
      FROM dbc.children
      WHERE parent = 'findb'
      GROUP BY 1
    )
GROUP BY 1, 2


| Database | User | Total CPU |
|---|---|---|
| FinDB | PSmith | 500,000 |
| FinDB_B | PROgers | 600,000 |
| ClaimDB_CO | BCRPRDUsr | 700,000 |

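Version 1 is the existing one, which has been in use for a long time (along with another, less efficient form of it), and it missed this user: FinDB, PSmith, 500,000. Using the query IDs and logdates for PSmith, I verified that he really was working with FinDB, yet he never made it onto the list in version #2. I am sure I am missing something basic (101-level), and I am trying to understand what causes the row gap.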
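One quick way to pin down exactly which rows make up the gap is to export both result sets and take a set difference on the (database, user) key. This is only a sketch in Python: the rows are copied from the sample table above, and which result lacks the PSmith row is assumed here purely for illustration. The names `result_a`, `result_b` and `missing_keys` are hypothetical.

```python
# Each row is (database, user, total_cpu). Values come from the sample table;
# which result lacks the PSmith row is an assumption made for illustration.
result_a = [
    ("FinDB", "PSmith", 500000),
    ("FinDB_B", "PROgers", 600000),
    ("ClaimDB_CO", "BCRPRDUsr", 700000),
]
result_b = [
    ("FinDB_B", "PROgers", 600000),
    ("ClaimDB_CO", "BCRPRDUsr", 700000),
]


def missing_keys(full, partial):
    """Return the (database, user) pairs present in `full` but absent from `partial`."""
    full_keys = {(db, user) for db, user, _ in full}
    partial_keys = {(db, user) for db, user, _ in partial}
    return sorted(full_keys - partial_keys)


gap = missing_keys(result_a, result_b)
print(gap)
```

Once the missing keys are known, their query IDs can be looked up in the object log to see which other databases those queries touched.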


Explain SELECT
    b.objectdatabasename,
    a.username,
    CAST(SUM((((a.AmpCPUTime(DEC(18,3))) + ZEROIFNULL(a.ParserCPUTime)))) AS DECIMAL(18,3)) (TITLE '')
    FROM pdcrinfo.dbqlogtbl a
    INNER JOIN (
     SELECT queryid, logdate,
      MIN(objectdatabasename) AS objectdatabasename
     FROM pdcrinfo.dbqlobjtbl_hst
     GROUP BY 1, 2) b
    ON (a.queryid = b.queryid
     AND a.logdate = b.logdate)
     AND a.logdate BETWEEN '2016-01-01' AND '2016-01-11'
     AND b.logdate BETWEEN '2016-01-01' AND '2016-01-11'
    -- filter on the already aggregated MIN value
    WHERE b.objectdatabasename IN (SEL child FROM dbc.children WHERE parent = 'findb' GROUP BY 1)
    GROUP BY 1, 2
    ORDER BY 3 DESC, 2 ASC, 1 ASC;

This query is optimized using type 2 profile insert-sel, profileid 
    1) First, we lock PDCRDATA.DBQLObjTbl_Hst for access, and we lock 
    PDCRDATA.DBQLogTbl_Hst in view pdcrinfo.dbqlogtbl for access. 
    2) Next, we lock DBC.dbase for access, and we lock DBC.owners for 
    3) We do an all-AMPs SUM step to aggregate from 11 partitions of 
    PDCRDATA.DBQLObjTbl_Hst with a condition of (
    "(PDCRDATA.DBQLObjTbl_Hst.LogDate >= DATE '2016-01-01') AND 
    (PDCRDATA.DBQLObjTbl_Hst.LogDate <= DATE '2016-01-11')") 
    , grouping by field1 (PDCRDATA.DBQLObjTbl_Hst.QueryID 
    ,PDCRDATA.DBQLObjTbl_Hst.LogDate). Aggregate Intermediate Results 
    are computed locally, then placed in Spool 3. The input table 
    will not be cached in memory, but it is eligible for synchronized 
    scanning. The size of Spool 3 is estimated with low confidence to 
    be 44,305,297 rows (5,715,383,313 bytes). The estimated time for 
    this step is 8.52 seconds. 
    4) We execute the following steps in parallel. 
     1) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by 
      way of an all-rows scan into Spool 1 (used to materialize 
      view, derived table, table function or table operator b) 
      (all_amps) (compressed columns allowed), which is built 
      locally on the AMPs. The size of Spool 1 is estimated with 
      low confidence to be 44,305,297 rows (5,316,635,640 bytes). 
      The estimated time for this step is 0.78 seconds. 
     2) We do an all-AMPs RETRIEVE step from DBC.dbase by way of an 
      all-rows scan with a condition of (
      CHARACTER SET LATIN, NOT CASESPECIFIC))= 'findb '") into Spool 
      9 (all_amps) (compressed columns allowed), which is 
      redistributed by the hash code of (DBC.dbase.DatabaseId) to 
      all AMPs. Then we do a SORT to order Spool 9 by row hash. 
      The size of Spool 9 is estimated with no confidence to be 348 
      rows (5,916 bytes). The estimated time for this step is 0.01 
     3) We do an all-AMPs RETRIEVE step from DBC.dbase by way of an 
      all-rows scan with no residual conditions locking for access 
      into Spool 10 (all_amps) (compressed columns allowed), which 
      is redistributed by the hash code of (DBC.dbase.DatabaseId) 
      to all AMPs. Then we do a SORT to order Spool 10 by row hash. 
      The size of Spool 10 is estimated with high confidence to be 
      3,478 rows (361,712 bytes). The estimated time for this step 
      is 0.01 seconds. 
    5) We do an all-AMPs JOIN step from Spool 9 (Last Use) by way of a 
    RowHash match scan, which is joined to DBC.owners by way of a 
    RowHash match scan with no residual conditions. Spool 9 and 
    DBC.owners are joined using a merge join, with a join condition of 
    ("DBC.owners.OwnerId = DatabaseId"). The result goes into Spool 
    11 (all_amps) (compressed columns allowed), which is redistributed 
    by the hash code of (DBC.owners.OwneeId) to all AMPs. Then we do 
    a SORT to order Spool 11 by row hash. The size of Spool 11 is 
    estimated with no confidence to be 10,450 rows (177,650 bytes). 
    The estimated time for this step is 0.02 seconds. 
    6) We do an all-AMPs JOIN step from Spool 10 (Last Use) by way of a 
    RowHash match scan, which is joined to Spool 11 (Last Use) by way 
    of a RowHash match scan. Spool 10 and Spool 11 are joined using a 
    merge join, with a join condition of ("OwneeId = DatabaseId"). 
    The result goes into Spool 8 (all_amps), which is redistributed by 
    the hash code of (SUBSTRING((TRANSLATE((DBC.dbase.DatabaseName 
    Then we do a SORT to order Spool 8 by row hash and the sort key in 
    spool field1 eliminating duplicate rows. The size of Spool 8 is 
    estimated with no confidence to be 3,478 rows (191,290 bytes). 
    The estimated time for this step is 0.02 seconds. 
    7) We do an all-AMPs RETRIEVE step from Spool 8 (Last Use) by way of 
    an all-rows scan into Spool 12 (all_amps) (compressed columns 
    allowed), which is duplicated on all AMPs. The size of Spool 12 
    is estimated with no confidence to be 1,752,912 rows (227,878,560 
    bytes). The estimated time for this step is 0.06 seconds. 
    8) We do an all-AMPs JOIN step from Spool 1 (Last Use) by way of an 
    all-rows scan with a condition of ("(b.LOGDATE <= DATE 
    '2016-01-11') AND (b.LOGDATE >= DATE '2016-01-01')"), which is 
    joined to Spool 12 (Last Use) by way of an all-rows scan. Spool 1 
    and Spool 12 are joined using a inclusion dynamic hash join, with 
    a join condition of ("OBJECTDATABASENAME = (TRANSLATE((Field_2 
    )USING LATIN_TO_UNICODE))"). The result goes into Spool 13 
    (all_amps) (compressed columns allowed), which is redistributed by 
    the rowkey of (PDCRDATA.DBQLObjTbl_Hst.LOGDATE, 
    PDCRDATA.DBQLObjTbl_Hst.QUERYID) to all AMPs. Then we do a SORT 
    to partition Spool 13 by rowkey. The size of Spool 13 is 
    estimated with no confidence to be 3,865 rows (432,880 bytes). 
    The estimated time for this step is 0.29 seconds. 
    9) We do an all-AMPs JOIN step from 11 partitions of 
    PDCRDATA.DBQLogTbl_Hst in view pdcrinfo.dbqlogtbl by way of a 
    RowHash match scan with a condition of ("(PDCRDATA.DBQLogTbl_Hst 
    in view pdcrinfo.dbqlogtbl.LogDate <= DATE '2016-01-11') AND 
    (PDCRDATA.DBQLogTbl_Hst in view pdcrinfo.dbqlogtbl.LogDate >= DATE 
    '2016-01-01')"), which is joined to Spool 13 (Last Use) by way of 
    a RowHash match scan. PDCRDATA.DBQLogTbl_Hst and Spool 13 are 
    joined using a rowkey-based merge join, with a join condition of (
    "(PDCRDATA.DBQLogTbl_Hst.LogDate = LOGDATE) AND 
    (PDCRDATA.DBQLogTbl_Hst.QueryID = QUERYID)"). The input table 
    PDCRDATA.DBQLogTbl_Hst will not be cached in memory, but it is 
    eligible for synchronized scanning. The result goes into Spool 7 
    (all_amps) (compressed columns allowed), which is built locally on 
    the AMPs. The size of Spool 7 is estimated with no confidence to 
    be 3,816 rows (782,280 bytes). The estimated time for this step 
    is 0.03 seconds. 
10) We do an all-AMPs SUM step to aggregate from Spool 7 (Last Use) by 
    way of an all-rows scan , grouping by field1 (
    PDCRDATA.DBQLObjTbl_Hst.Field_4 ,PDCRDATA.DBQLogTbl_Hst.UserName). 
    Aggregate Intermediate Results are computed globally, then placed 
    in Spool 15. The size of Spool 15 is estimated with no confidence 
    to be 3,478 rows (2,472,858 bytes). The estimated time for this 
    step is 0.02 seconds. 
11) We do an all-AMPs RETRIEVE step from Spool 15 (Last Use) by way of 
    an all-rows scan into Spool 5 (group_amps), which is built locally 
    on the AMPs. Then we do a SORT to order Spool 5 by the sort key 
    in spool field1 (SUM((PDCRDATA.DBQLogTbl_Hst.AMPCPUTime 
    )))(DECIMAL(18,3)), PDCRDATA.DBQLogTbl_Hst.UserName, 
    PDCRDATA.DBQLObjTbl_Hst.Field_4). The size of Spool 5 is 
    estimated with no confidence to be 3,478 rows (2,201,574 bytes). 
    The estimated time for this step is 0.01 seconds. 
12) Finally, we send out an END TRANSACTION step to all AMPs involved 
    in processing the request. 
    -> The contents of Spool 5 are sent back to the user as the result of 
    statement 1. The total estimated time is 9.75 seconds. 


Explain SELECT
    b.objectdatabasename,
    a.username,
    CAST(SUM((((a.AmpCPUTime(DEC(18,3))) + ZEROIFNULL(a.ParserCPUTime)))) AS DECIMAL(18,3))
    FROM pdcrinfo.dbqlogtbl a
    INNER JOIN (
     SELECT queryid, logdate,
      MIN(objectdatabasename) AS objectdatabasename
     FROM pdcrinfo.dbqlobjtbl_hst
     -- filter the object rows first, then take the MIN
     WHERE objectdatabasename IN (SEL child FROM dbc.children WHERE parent = 'findb' GROUP BY 1)
     GROUP BY 1, 2) b
    ON (a.queryid = b.queryid
     AND a.logdate = b.logdate)
     AND a.logdate BETWEEN '2016-01-01' AND '2016-01-11'
     AND b.logdate BETWEEN '2016-01-01' AND '2016-01-11'
    GROUP BY 1, 2
    ORDER BY 3 DESC, 1 ASC, 2 ASC;

This query is optimized using type 2 profile insert-sel, profileid 
    1) First, we lock PDCRDATA.DBQLObjTbl_Hst for access, and we lock 
    PDCRDATA.DBQLogTbl_Hst in view pdcrinfo.dbqlogtbl for access. 
    2) Next, we lock DBC.dbase for access, and we lock DBC.owners for 
    3) We execute the following steps in parallel. 
     1) We do an all-AMPs RETRIEVE step from DBC.dbase by way of an 
      all-rows scan with a condition of (
      CHARACTER SET LATIN, NOT CASESPECIFIC))= 'findb '") into Spool 
      5 (all_amps) (compressed columns allowed), which is 
      redistributed by the hash code of (DBC.dbase.DatabaseId) to 
      all AMPs. Then we do a SORT to order Spool 5 by row hash. 
      The size of Spool 5 is estimated with no confidence to be 348 
      rows (5,916 bytes). The estimated time for this step is 0.01 
     2) We do an all-AMPs RETRIEVE step from DBC.dbase by way of an 
      all-rows scan with no residual conditions locking for access 
      into Spool 6 (all_amps) (compressed columns allowed), which 
      is redistributed by the hash code of (DBC.dbase.DatabaseId) 
      to all AMPs. Then we do a SORT to order Spool 6 by row hash. 
      The size of Spool 6 is estimated with high confidence to be 
      3,478 rows (361,712 bytes). The estimated time for this step 
      is 0.01 seconds. 
    4) We do an all-AMPs JOIN step from Spool 5 (Last Use) by way of a 
    RowHash match scan, which is joined to DBC.owners by way of a 
    RowHash match scan with no residual conditions. Spool 5 and 
    DBC.owners are joined using a merge join, with a join condition of 
    ("DBC.owners.OwnerId = DatabaseId"). The result goes into Spool 7 
    (all_amps) (compressed columns allowed), which is redistributed by 
    the hash code of (DBC.owners.OwneeId) to all AMPs. Then we do a 
    SORT to order Spool 7 by row hash. The size of Spool 7 is 
    estimated with no confidence to be 10,450 rows (177,650 bytes). 
    The estimated time for this step is 0.02 seconds. 
    5) We execute the following steps in parallel. 
     1) We do an all-AMPs JOIN step from Spool 6 (Last Use) by way of 
      a RowHash match scan, which is joined to Spool 7 (Last Use) 
      by way of a RowHash match scan. Spool 6 and Spool 7 are 
      joined using a merge join, with a join condition of (
      "OwneeId = DatabaseId"). The result goes into Spool 4 
      (all_amps), which is redistributed by the hash code of (
      do a SORT to order Spool 4 by row hash and the sort key in 
      spool field1 eliminating duplicate rows. The size of Spool 4 
      is estimated with no confidence to be 3,478 rows (191,290 
      bytes). The estimated time for this step is 0.02 seconds. 
     2) We do an all-AMPs RETRIEVE step from 11 partitions of 
      PDCRDATA.DBQLObjTbl_Hst with a condition of (
      "(PDCRDATA.DBQLObjTbl_Hst.LogDate >= DATE '2016-01-01') AND 
      (PDCRDATA.DBQLObjTbl_Hst.LogDate <= DATE '2016-01-11')") into 
      Spool 8 (all_amps) (compressed columns allowed), which is 
      built locally on the AMPs. The input table will not be 
      cached in memory, but it is eligible for synchronized 
      scanning. The size of Spool 8 is estimated with high 
      confidence to be 109,751,471 rows (12,292,164,752 bytes). 
      The estimated time for this step is 4.29 seconds. 
    6) We do an all-AMPs RETRIEVE step from Spool 4 (Last Use) by way of 
    an all-rows scan into Spool 9 (all_amps) (compressed columns 
    allowed), which is duplicated on all AMPs. The size of Spool 9 is 
    estimated with no confidence to be 1,752,912 rows (227,878,560 
    bytes). The estimated time for this step is 0.06 seconds. 
    7) We do an all-AMPs JOIN step from Spool 8 (Last Use) by way of an 
    all-rows scan, which is joined to Spool 9 (Last Use) by way of an 
    all-rows scan. Spool 8 and Spool 9 are joined using a single 
    partition inclusion hash join, with a join condition of (
    "ObjectDatabaseName = (TRANSLATE((Field_2)USING 
    LATIN_TO_UNICODE))"). The result goes into Spool 3 (all_amps) 
    (compressed columns allowed), which is built locally on the AMPs. 
    The size of Spool 3 is estimated with no confidence to be 
    36,436,341 rows (4,153,742,874 bytes). The estimated time for 
    this step is 1.05 seconds. 
    8) We do an all-AMPs SUM step to aggregate from Spool 3 (Last Use) by 
    way of an all-rows scan , grouping by field1 (
    PDCRDATA.DBQLObjTbl_Hst.QueryID ,PDCRDATA.DBQLObjTbl_Hst.LogDate). 
    Aggregate Intermediate Results are computed locally, then placed 
    in Spool 11. The size of Spool 11 is estimated with no confidence 
    to be 36,436,341 rows (4,700,287,989 bytes). The estimated time 
    for this step is 3.10 seconds. 
    9) We do an all-AMPs RETRIEVE step from Spool 11 (Last Use) by way of 
    an all-rows scan into Spool 1 (used to materialize view, derived 
    table, table function or table operator b) (all_amps) (compressed 
    columns allowed), which is built locally on the AMPs. The size of 
    Spool 1 is estimated with no confidence to be 36,436,341 rows (
    4,372,360,920 bytes). The estimated time for this step is 0.65 
10) We do an all-AMPs RETRIEVE step from Spool 1 (Last Use) by way of 
    an all-rows scan with a condition of ("(b.LOGDATE <= DATE 
    '2016-01-11') AND (b.LOGDATE >= DATE '2016-01-01')") into Spool 16 
    (all_amps) (compressed columns allowed), which is redistributed by 
    the rowkey of (PDCRDATA.DBQLObjTbl_Hst.QueryID, 
    PDCRDATA.DBQLObjTbl_Hst.LogDate) to all AMPs. Then we do a SORT 
    to partition Spool 16 by rowkey. The size of Spool 16 is 
    estimated with no confidence to be 36,436,341 rows (4,080,870,192 
    bytes). The estimated time for this step is 3.86 seconds. 
11) We do an all-AMPs JOIN step from 11 partitions of 
    PDCRDATA.DBQLogTbl_Hst in view pdcrinfo.dbqlogtbl by way of a 
    RowHash match scan with a condition of ("(PDCRDATA.DBQLogTbl_Hst 
    in view pdcrinfo.dbqlogtbl.LogDate <= DATE '2016-01-11') AND 
    (PDCRDATA.DBQLogTbl_Hst in view pdcrinfo.dbqlogtbl.LogDate >= DATE 
    '2016-01-01')"), which is joined to Spool 16 (Last Use) by way of 
    a RowHash match scan. PDCRDATA.DBQLogTbl_Hst and Spool 16 are 
    joined using a rowkey-based merge join, with a join condition of (
    (PDCRDATA.DBQLogTbl_Hst.LogDate = LOGDATE)"). The input table 
    PDCRDATA.DBQLogTbl_Hst will not be cached in memory, but it is 
    eligible for synchronized scanning. The result goes into Spool 15 
    (all_amps) (compressed columns allowed), which is built locally on 
    the AMPs. The size of Spool 15 is estimated with no confidence to 
    be 35,969,436 rows (7,373,734,380 bytes). The estimated time for 
    this step is 1.72 seconds. 
12) We do an all-AMPs SUM step to aggregate from Spool 15 (Last Use) 
    by way of an all-rows scan , grouping by field1 (
    ,PDCRDATA.DBQLogTbl_Hst.UserName). Aggregate Intermediate Results 
    are computed globally, then placed in Spool 17. The size of Spool 
    17 is estimated with no confidence to be 6,175,740 rows (
    4,390,951,140 bytes). The estimated time for this step is 1.61 
13) We do an all-AMPs RETRIEVE step from Spool 17 (Last Use) by way of 
    an all-rows scan into Spool 13 (group_amps), which is built 
    locally on the AMPs. Then we do a SORT to order Spool 13 by the 
    sort key in spool field1 (SUM((PDCRDATA.DBQLogTbl_Hst.AMPCPUTime 
    )))(DECIMAL(18,3)), PDCRDATA.DBQLObjTbl_Hst.ObjectDatabaseName, 
    PDCRDATA.DBQLogTbl_Hst.UserName). The size of Spool 13 is 
    estimated with no confidence to be 6,175,740 rows (3,909,243,420 
    bytes). The estimated time for this step is 0.43 seconds. 
14) Finally, we send out an END TRANSACTION step to all AMPs involved 
    in processing the request. 
    -> The contents of Spool 13 are sent back to the user as the result 
    of statement 1. The total estimated time is 16.79 seconds. 

Can you provide the EXPLAIN plans?


Thanks, Rob. I have just posted them. – user1874594



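In #2 you filter on the database name before the MIN, but in #1 you filter after the MIN.

Assume a query accesses the bold databases: then subquery b in #1 returns 'Bla_DB' as the MIN, while in #2 it returns 'Fin_DB'. Because 'Bla_DB' is not a child of findb, the outer filter in #1 then discards that query altogether, along with its CPU time.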

  • DBC
    • sysdba
      • FinDB
        • FinDB_B
        • ClaimDB_CO
        • ...
      • ...
      • Bla_DB
    • ...
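The effect can be reproduced on toy data. The following is a minimal sketch that uses Python's built-in `sqlite3` in place of Teradata; the tables `objlog` and `children` and all of their rows are invented for illustration and only mimic the shape of `dbqlobjtbl_hst` and `dbc.children`.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE objlog (queryid INTEGER, logdate TEXT, objectdatabasename TEXT);
    CREATE TABLE children (child TEXT, parent TEXT);
    -- Query 1 touches a findb child AND an unrelated database.
    -- Query 2 touches only a findb child.
    INSERT INTO objlog VALUES
        (1, '2016-01-05', 'FinDB_B'),
        (1, '2016-01-05', 'Bla_DB'),
        (2, '2016-01-06', 'ClaimDB_CO');
    INSERT INTO children VALUES ('FinDB_B', 'findb'), ('ClaimDB_CO', 'findb');
""")

# MIN first, filter afterwards: query 1 collapses to 'Bla_DB' and is filtered out.
min_then_filter = con.execute("""
    SELECT queryid, objectdatabasename
    FROM (SELECT queryid, logdate, MIN(objectdatabasename) AS objectdatabasename
          FROM objlog
          GROUP BY queryid, logdate) b
    WHERE objectdatabasename IN (SELECT child FROM children WHERE parent = 'findb')
    ORDER BY queryid
""").fetchall()

# Filter first, MIN afterwards: 'Bla_DB' never reaches the MIN, so query 1 survives.
filter_then_min = con.execute("""
    SELECT queryid, MIN(objectdatabasename) AS objectdatabasename
    FROM objlog
    WHERE objectdatabasename IN (SELECT child FROM children WHERE parent = 'findb')
    GROUP BY queryid, logdate
    ORDER BY queryid
""").fetchall()

print(min_then_filter)   # [(2, 'ClaimDB_CO')]
print(filter_then_min)   # [(1, 'FinDB_B'), (2, 'ClaimDB_CO')]
```

The only difference between the two statements is where the `IN` filter sits relative to the `GROUP BY`, which is enough to make query 1 disappear from the first result.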

TY Dieter! – user1874594


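Hi Dieter (I was thinking out loud in the previous comment, so I have corrected it here). Isn't it in the **second** one that I do the FILTER first and then the MIN? `where objectdatabasename in (sel child from dbc.children where parent ='findb' group by 1) GROUP BY 1,2)` is the second one. Its explain order is: DBC.DBase join (& steps) Owners (steps 1-6) -> Spool 7 JOIN LogObject table, built locally -> SUM (first aggregation, for the MIN) -> {more steps} -> SUM (the second one, the final grouping?). So it is the second one that filters and then takes the MIN? – user1874594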


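And the first one does the MIN and then filters? That would explain why the second one has more rows: after you qualify the rows as children of the parent, you roll them up to eliminate the query ID dupes. So you want to group on the "final set"? I could be wrong... just checking. TY again – user1874594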
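To check this against real data, one option is to list the queries that touched at least one child of the parent database but whose overall MIN database name is not a child; those are the ones the MIN-then-filter form drops. Below is a hedged sketch on invented rows, again using Python's `sqlite3` rather than Teradata; `objlog`, `children`, `overall_min` and `min_child` are hypothetical names.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE objlog (queryid INTEGER, logdate TEXT, objectdatabasename TEXT);
    CREATE TABLE children (child TEXT, parent TEXT);
    INSERT INTO objlog VALUES
        (1, '2016-01-05', 'FinDB_B'),
        (1, '2016-01-05', 'Bla_DB'),
        (2, '2016-01-06', 'ClaimDB_CO'),
        (3, '2016-01-07', 'Bla_DB');
    INSERT INTO children VALUES ('FinDB_B', 'findb'), ('ClaimDB_CO', 'findb');
""")

# overall_min: MIN over every database the query touched.
# min_child:   MIN over the findb children only (NULL if it touched none).
# A query is lost by MIN-then-filter when it has a child but overall_min is not it.
dropped = con.execute("""
    SELECT queryid, logdate, overall_min, min_child
    FROM (SELECT queryid, logdate,
                 MIN(objectdatabasename) AS overall_min,
                 MIN(CASE WHEN objectdatabasename IN
                              (SELECT child FROM children WHERE parent = 'findb')
                          THEN objectdatabasename END) AS min_child
          FROM objlog
          GROUP BY queryid, logdate) t
    WHERE min_child IS NOT NULL
      AND overall_min <> min_child
    ORDER BY queryid
""").fetchall()

print(dropped)   # [(1, '2016-01-05', 'Bla_DB', 'FinDB_B')]
```

Query 3 is not reported because it never touched a child of findb, so neither form of the report would include it.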