2017-02-21 79 views
2

SQL ROLLUP or Union?

I'm trying to get a total count of the entries, but unfortunately I don't believe ROLLUP would be the best option:

SELECT BUSINESS_STATUS_NAME, 
    PENDING_ITEMS, 
    DATAGROUP 
FROM PAYMENTS 
WHERE STATUS LIKE '%PROCESS%'; 

This produces:

BUSINESS_STATUS_NAME  PENDING_ITEMS  DATAGROUP
PROCESSING DATA       34             PRODUCT
PROCESSING INS        40             SERVICE

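I'd like to get a grand total underneath, but ROLLUP gives me subtotals because it includes the DATAGROUP column. I only need the grand total of the pending items, but I still need to display the data group. Would a UNION with a SELECT SUM(PENDING_ITEMS) query be better?

BUSINESS_STATUS_NAME  PENDING_ITEMS  DATAGROUP
PROCESSING DATA       34             PRODUCT
PROCESSING INS        40             SERVICE
GRAND TOTAL           74

Thanks!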

+2

Use ROLLUP for better performance. If you need subtotals, use it with the GROUP BY clause. See http://sql-plsql.blogspot.in/2010/10/rollup.html –

Answers

0

You can use ROLLUP, but you need an aggregation query:

SELECT BUSINESS_STATUS_NAME, 
     SUM(PENDING_ITEMS) as PENDING_ITEMS, 
     DATAGROUP 
FROM PAYMENTS 
WHERE STATUS LIKE '%PROCESS%' 
GROUP BY ROLLUP (BUSINESS_STATUS_NAME, DATAGROUP); 

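I doubt there is a performance difference between this and a UNION ALL. Note, however, that this guarantees that the summary row is the last row in the result set.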
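For reference, the subtotals the question complains about come from how ROLLUP expands its column list. The sketch below spells out the grouping sets involved, using the table and column names from the question. The rewrite as GROUPING SETS is an illustration added here, not part of the original answer.

```sql
-- ROLLUP (BUSINESS_STATUS_NAME, DATAGROUP) expands to three grouping sets:
-- detail rows, one subtotal per BUSINESS_STATUS_NAME, and a grand total.
SELECT BUSINESS_STATUS_NAME,
       SUM(PENDING_ITEMS) AS PENDING_ITEMS,
       DATAGROUP
FROM PAYMENTS
WHERE STATUS LIKE '%PROCESS%'
GROUP BY GROUPING SETS ((BUSINESS_STATUS_NAME, DATAGROUP),
                        (BUSINESS_STATUS_NAME),
                        ());

-- ROLLUP ((BUSINESS_STATUS_NAME, DATAGROUP)) treats the pair as a single
-- composite column, so only the detail rows and the grand total remain.
SELECT BUSINESS_STATUS_NAME,
       SUM(PENDING_ITEMS) AS PENDING_ITEMS,
       DATAGROUP
FROM PAYMENTS
WHERE STATUS LIKE '%PROCESS%'
GROUP BY GROUPING SETS ((BUSINESS_STATUS_NAME, DATAGROUP),
                        ());
```

The second form is the one that matches the output asked for in the question: no per-status subtotals, just one total row.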

+0

I believe you need another pair of parentheses – Aleksej

+0

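UNION ALL may need to read the base table twice - why would there be no performance difference compared with reading it only once (as the 'rollup' solution does)? – mathguy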

3

I would use ROLLUP, for both clarity and performance.

Say you have a sample table like this:

create table payments (business_status_name, pending_items, datagroup) as (
    select 'PROCESSING DATA', 10, 'PRODUCT' from dual union all 
    select 'PROCESSING DATA', 5, 'PRODUCT' from dual union all 
    select 'PROCESSING DATA', 2, 'SERVICE' from dual union all 
    select 'PROCESSING INS', 10, 'SERVICE' from dual union all 
    select 'PROCESSING INS', 10, 'SERVICE' from dual union all 
    select 'PROCESSING INS', 10, 'PRODUCT' from dual 
) 

This is the ROLLUP way (note the extra parentheses, which change the grouping logic):

SELECT BUSINESS_STATUS_NAME, 
     SUM(PENDING_ITEMS) as PENDING_ITEMS, 
     DATAGROUP 
FROM PAYMENTS 
GROUP BY ROLLUP ((BUSINESS_STATUS_NAME, DATAGROUP)) 

Result:

BUSINESS_STATUS PENDING_ITEMS DATAGRO
--------------- ------------- -------
PROCESSING INS             10 PRODUCT
PROCESSING INS             20 SERVICE
PROCESSING DATA            15 PRODUCT
PROCESSING DATA             2 SERVICE
                           47

The plan:

-------------------------------------------------------------------------------
| Id | Operation            | Name     | Rows | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|  0 | SELECT STATEMENT     |          |    6 |   186 |     4  (25)| 00:00:01 |
|  1 |  SORT GROUP BY ROLLUP|          |    6 |   186 |     4  (25)| 00:00:01 |
|  2 |   TABLE ACCESS FULL  | PAYMENTS |    6 |   186 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------
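A plan like the one above can be produced in Oracle with EXPLAIN PLAN and DBMS_XPLAN.DISPLAY. This is a minimal sketch, assuming that is how the plans here were obtained (the answer does not say) and that a plan table is available in the session:

```sql
-- Store the estimated plan for the ROLLUP query in the plan table.
EXPLAIN PLAN FOR
SELECT BUSINESS_STATUS_NAME,
       SUM(PENDING_ITEMS) AS PENDING_ITEMS,
       DATAGROUP
FROM PAYMENTS
GROUP BY ROLLUP ((BUSINESS_STATUS_NAME, DATAGROUP));

-- Format and print the most recently explained plan.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Note that this shows the optimizer's estimate only; the statement itself is not executed.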

This is the UNION ALL way:

SELECT BUSINESS_STATUS_NAME, 
     SUM(PENDING_ITEMS) as PENDING_ITEMS, 
     DATAGROUP 
FROM PAYMENTS 
GROUP BY BUSINESS_STATUS_NAME, DATAGROUP 
UNION ALL 
SELECT NULL, SUM(PENDING_ITEMS), NULL 
FROM PAYMENTS; 

The result is the same as with ROLLUP:

BUSINESS_STATUS PENDING_ITEMS DATAGRO
--------------- ------------- -------
PROCESSING INS             20 SERVICE
PROCESSING INS             10 PRODUCT
PROCESSING DATA            15 PRODUCT
PROCESSING DATA             2 SERVICE
                           47

The plan is not as good, with TWO FULL SCANS:

------------------------------------------------------------------------------
| Id | Operation           | Name     | Rows | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|  0 | SELECT STATEMENT    |          |    7 |   199 |     7  (58)| 00:00:01 |
|  1 |  UNION-ALL          |          |      |       |            |          |
|  2 |   HASH GROUP BY     |          |    6 |   186 |     4  (25)| 00:00:01 |
|  3 |    TABLE ACCESS FULL| PAYMENTS |    6 |   186 |     3   (0)| 00:00:01 |
|  4 |   SORT AGGREGATE    |          |    1 |    13 |            |          |
|  5 |    TABLE ACCESS FULL| PAYMENTS |    6 |    78 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------

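This is of course just a small example with few records, no indexes, and so on, so things may look different on a real table, but I still think ROLLUP should be better than UNION ALL.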
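To check this on a real table, run-time statistics are more telling than an estimated plan. The sketch below is one way to collect them, assuming the session is allowed to query the V$ views that DBMS_XPLAN.DISPLAY_CURSOR relies on. The hint and the format string are standard Oracle features, but their use here is an addition for illustration.

```sql
-- Run the query once with run-time statistics collection enabled.
SELECT /*+ GATHER_PLAN_STATISTICS */
       BUSINESS_STATUS_NAME,
       SUM(PENDING_ITEMS) AS PENDING_ITEMS,
       DATAGROUP
FROM PAYMENTS
GROUP BY ROLLUP ((BUSINESS_STATUS_NAME, DATAGROUP));

-- Show the plan of the last statement executed in this session, with
-- actual row counts and buffer gets for each step.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

Repeating the same two steps for the UNION ALL query gives actual figures to compare for the two approaches. In SQL*Plus, SERVEROUTPUT should be off, otherwise the "last statement" is not the query itself.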

In a simple case exactly like yours, these would be the plans for the two approaches:

SELECT BUSINESS_STATUS_NAME, 
     SUM(PENDING_ITEMS) as PENDING_ITEMS, 
     DATAGROUP 
FROM PAYMENTS 
GROUP BY ROLLUP ((BUSINESS_STATUS_NAME, DATAGROUP)) 

-------------------------------------------------------------------------------
| Id | Operation            | Name     | Rows | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|  0 | SELECT STATEMENT     |          |    2 |    62 |     4  (25)| 00:00:01 |
|  1 |  SORT GROUP BY ROLLUP|          |    2 |    62 |     4  (25)| 00:00:01 |
|  2 |   TABLE ACCESS FULL  | PAYMENTS |    2 |    62 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------

SELECT BUSINESS_STATUS_NAME, 
     PENDING_ITEMS, 
     DATAGROUP 
FROM PAYMENTS 
UNION ALL 
SELECT NULL, 
     SUM(PENDING_ITEMS), 
     NULL 
FROM PAYMENTS  

------------------------------------------------------------------------------
| Id | Operation           | Name     | Rows | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|  0 | SELECT STATEMENT    |          |    3 |    75 |     6  (50)| 00:00:01 |
|  1 |  UNION-ALL          |          |      |       |            |          |
|  2 |   TABLE ACCESS FULL | PAYMENTS |    2 |    62 |     3   (0)| 00:00:01 |
|  3 |   SORT AGGREGATE    |          |    1 |    13 |            |          |
|  4 |    TABLE ACCESS FULL| PAYMENTS |    2 |    26 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------

ROLLUP still has the better plan, with a single table scan.
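If the total row should carry a label, as in the desired output in the question, GROUPING() can tell the rollup row apart from the detail rows. This is a sketch against the sample table above; the 'GRAND TOTAL' label and the ORDER BY are additions for illustration, not part of the original queries.

```sql
SELECT CASE
         WHEN GROUPING(BUSINESS_STATUS_NAME) = 1 THEN 'GRAND TOTAL'
         ELSE BUSINESS_STATUS_NAME
       END                AS BUSINESS_STATUS_NAME,
       SUM(PENDING_ITEMS) AS PENDING_ITEMS,
       DATAGROUP
FROM PAYMENTS
GROUP BY ROLLUP ((BUSINESS_STATUS_NAME, DATAGROUP))
-- GROUPING() returns 1 only on the rollup row, so sorting on it first
-- places the total after all detail rows.
ORDER BY GROUPING(BUSINESS_STATUS_NAME),
         BUSINESS_STATUS_NAME,
         DATAGROUP;
```

Sorting on GROUPING() puts the total last explicitly, rather than relying on the order in which the rows happen to be returned. It also keeps the total row distinguishable from a detail row whose BUSINESS_STATUS_NAME is genuinely NULL.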

+0

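The important thing to look at when comparing plans is not the "cost" (which should only be compared between different execution plans for the **same** query, not between two different queries that solve the same problem, both correct but using different approaches). What matters is that 'union all' needs to access the base table **twice**. Despite Gordon's opinion to the contrary (in the other answer), this will almost certainly make the 'union all' query slower (and possibly much slower) than the 'rollup' query. – mathguy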

+0

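Thanks for the explanation. I agree with you about the cost, but bear in mind that your result set is not how I want to display the data... It looks like the rollup requires further grouping, which makes no sense... I only need the grand total of all rows, with no further grouping (given that the initial grouping has already been performed). Does this make sense? –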

+0

@Rob_E: I don't follow. Given my sample table, what should the result be? – Aleksej