2016-06-08

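I have been trying to write a query against our company's very large database to pull the maximum billed amounts (A and B in this example) for our customers. We want the maximum A/B for each customer over the past month, along with the maximum A/B over the past year. Because of the large data set and the negative values, the Oracle query runs very slowly.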

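One problem we noticed in our billing database is the way it stores "cancelled" bills. It cancels a bill by adding a second, negated copy of the original billing record to the billing table, like this:

[Image: sample rows from the billing table]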

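In this case, 41040 is the incorrect bill, so a negated copy of that record was added. But when I select the maximum of this column, I still get 41040 back instead of the correct billed value of 50. The table does not appear to flag these incorrect bills in any way that would make them easy to filter out.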
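To make the problem concrete, here is a minimal Python sketch of the behaviour described above. It is only an illustration run outside the database; the row ids are hypothetical, and the amounts are the ones mentioned in this question.

```python
# (id, billed amount); the ids are hypothetical, the amounts come from the question.
# Row 2 is the negated copy that cancels row 1.
bills = [(1, 41040), (2, -41040), (3, 50)]

# A plain maximum ignores the cancellation and returns the voided bill.
naive_max = max(amount for _, amount in bills)

# Cancel each amount against an earlier amount of the opposite sign.
remaining = []
for _, amount in bills:
    if amount != 0 and -amount in remaining:
        remaining.remove(-amount)  # this row voids an earlier bill
    else:
        remaining.append(amount)

corrected_max = max(remaining)
print(naive_max, corrected_max)
```

The first number is what a plain MAX over the column returns; the second is the value I actually want.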

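My current workaround is to treat the row with the maximum ID as the correct bill. This assumes that the last bill entered for a month is the correct one.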

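This appears to return the correct data, but the query runs incredibly slowly on the large data set, and I don't have write access to this table to add or view indexes. There are 98,007,807 rows in total and 1,596,491 distinct customers. Is there any way to optimize the query to improve performance?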

select mth.KY_CUSTOMER_NO,
       max(QY_MTH_BILLED_A) as QY_MTH_BILLED_A,
       max(QY_MTH_B) as QY_MTH_BILLING_B,
       max.MAX_BILLING_A,
       max.MAX_BILLING_B
from (
    --Get the max A/B values for the past month
    select m.*
    from CUSTOMER_USAGE m
    where rev_year = to_number(to_char(sysdate,'yyyy'))
    and rev_mth in (to_number(to_char(add_months(sysdate, -1), 'mm')), to_number(to_char(sysdate,'mm')))
    and ID in (select max(ID) from CUSTOMER_USAGE where KY_CUSTOMER_NO = m.KY_CUSTOMER_NO group by rev_mth, rev_year)
) mth join
(
    --Get the max A/B values for the past year
    select KY_CUSTOMER_NO, max(QY_MTH_B) as MAX_BILLING_B, max(QY_MTH_BILLED_A) as MAX_BILLING_A
    from CUSTOMER_USAGE m
    where DT_ADDED > current_timestamp - 365
    and ID in (select max(ID) from CUSTOMER_USAGE where KY_CUSTOMER_NO = m.KY_CUSTOMER_NO group by rev_mth, rev_year)
    group by KY_CUSTOMER_NO
) max on mth.KY_CUSTOMER_NO = max.KY_CUSTOMER_NO
group by mth.KY_CUSTOMER_NO, max.MAX_BILLING_A, max.MAX_BILLING_B

What indexes exist, and what is the current query plan?

Answers


Analytic functions look like the solution.

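I have left out the WHERE clauses because they are not needed for your sample data, but you should be able to add them back into the innermost inline view. You can also use EXTRACT(YEAR FROM SYSDATE) instead of converting the date to a string and then back to a number.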

Oracle Setup

CREATE TABLE customer_usage (id, ky_customer_no, rev_mth, rev_year, qy_mth_billed_a, qy_mth_billed_b) AS 
SELECT 1, 1, 1, 2016, 41040, 0 FROM DUAL UNION ALL 
SELECT 2, 1, 1, 2016, -41040, 0 FROM DUAL UNION ALL 
SELECT 3, 1, 1, 2016,  50, 0 FROM DUAL UNION ALL 
SELECT 4, 1, 1, 2016,  0, 0 FROM DUAL; 

Query

SELECT id,
       ky_customer_no,
       rev_mth,
       rev_year,
       qy_mth_billed_a,
       qy_mth_billed_b
FROM (
  SELECT c.*,
         ROW_NUMBER()
           OVER (PARTITION BY ky_customer_no, rev_year, rev_mth
                 ORDER BY total_mth_billed_a DESC) AS rn
  FROM (
    SELECT c.*,
           SUM(qy_mth_billed_a)
             OVER (PARTITION BY ky_customer_no, rev_year, rev_mth, ABS(qy_mth_billed_a)
                   ORDER BY id DESC) AS total_mth_billed_a
    FROM customer_usage c
  ) c
)
WHERE rn = 1;

Output

 ID KY_CUSTOMER_NO REV_MTH REV_YEAR QY_MTH_BILLED_A QY_MTH_BILLED_B 
---------- -------------- ---------- ---------- --------------- --------------- 
     3    1   1  2016    50    0 
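For readers who want to trace the logic by hand, the following Python sketch mirrors the two analytic steps on the same four sample rows. It is an illustration only, not a replacement for the SQL: the variable names are made up for this sketch, and ties are broken by lowest id here, whereas ROW_NUMBER leaves the order of ties unspecified.

```python
from collections import defaultdict

# (id, ky_customer_no, rev_mth, rev_year, qy_mth_billed_a) from the setup above.
rows = [
    (1, 1, 1, 2016, 41040),
    (2, 1, 1, 2016, -41040),
    (3, 1, 1, 2016, 50),
    (4, 1, 1, 2016, 0),
]

# Inner step: running SUM per (customer, year, month, ABS(amount)), newest id first.
running = defaultdict(int)
totals = {}
for id_, cust, mth, year, amount in sorted(rows, key=lambda r: r[0], reverse=True):
    key = (cust, year, mth, abs(amount))
    running[key] += amount
    totals[id_] = running[key]

# Outer step: per (customer, year, month), keep the row with the largest running total.
best = {}
for id_, cust, mth, year, amount in rows:
    key = (cust, year, mth)
    if key not in best or totals[id_] > totals[best[key][0]]:
        best[key] = (id_, amount)

print(totals)
print(best)
```

The cancelled pair ends up with running totals of 0 and -41040, so the bill of 50 (id 3) ranks first for the month.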

I tried another approach, reusing most of @MT0's setup.

CREATE TABLE customer_usage (id, ky_customer_no, rev_mth, rev_year, qy_mth_billed_a, qy_mth_billed_b) AS 
SELECT 1, 1, 1, 2016, 41040, 0 FROM DUAL UNION ALL 
SELECT 2, 1, 1, 2016, -41040, 0 FROM DUAL UNION ALL 
SELECT 3, 1, 1, 2016,  50, 0 FROM DUAL UNION ALL 
SELECT 4, 1, 1, 2016,  0, 0 FROM DUAL; 

Because we want to get rid of values whose ABS() is equal but whose signs differ, I tried this:

SELECT c.KY_CUSTOMER_NO, c.REV_MTH, c.REV_YEAR,
       max(qy_mth_billed_a) as qy_mth_billed_a,
       max(QY_MTH_BILLED_B) as qy_mth_billed_b
FROM (
  SELECT c.*,
         max(qy_mth_billed_a)
           OVER (PARTITION BY ky_customer_no, rev_year, rev_mth, ABS(qy_mth_billed_a)) AS max_mth_billed_a,
         min(qy_mth_billed_a)
           OVER (PARTITION BY ky_customer_no, rev_year, rev_mth, ABS(qy_mth_billed_a)) AS min_mth_billed_a
  FROM customer_usage c
) c
where max_mth_billed_a + min_mth_billed_a != 0
group by c.KY_CUSTOMER_NO, c.REV_MTH, c.REV_YEAR;

The output is the same. Since you are facing performance problems, I would try both approaches:

KY_CUSTOMER_NO REV_MTH REV_YEAR QY_MTH_BILLED_A QY_MTH_BILLED_B
-------------- ------- -------- --------------- ---------------
             1       1     2016              50               0

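Edit: Actually, if you count the signs for each ABS(value) and keep only the groups where that count is odd, I think it will run faster (it needs only one window function).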

SELECT c.KY_CUSTOMER_NO, c.REV_MTH, c.REV_YEAR,
       max(qy_mth_billed_a) as qy_mth_billed_a,
       max(QY_MTH_BILLED_B) as qy_mth_billed_b
FROM (
  SELECT c.*,
         count(sign(qy_mth_billed_a))
           OVER (PARTITION BY ky_customer_no, rev_year, rev_mth, ABS(qy_mth_billed_a)) AS signo
  FROM customer_usage c
) c
where mod(signo, 2) = 1
group by c.KY_CUSTOMER_NO, c.REV_MTH, c.REV_YEAR
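As a rough check of the parity idea, here is a small Python sketch that applies the same rule to the four sample rows. It is only an illustration of the logic under the assumption that a cancellation always has the same absolute value as the bill it voids; the names are made up for this sketch.

```python
from collections import Counter

# (id, ky_customer_no, rev_mth, rev_year, qy_mth_billed_a) from the setup above.
rows = [
    (1, 1, 1, 2016, 41040),
    (2, 1, 1, 2016, -41040),
    (3, 1, 1, 2016, 50),
    (4, 1, 1, 2016, 0),
]

# Window COUNT: number of rows sharing (customer, year, month, ABS(amount)).
counts = Counter(
    (cust, year, mth, abs(amount)) for _, cust, mth, year, amount in rows
)

# Keep rows whose group has an odd count, then take MAX per (customer, month, year).
result = {}
for _, cust, mth, year, amount in rows:
    if counts[(cust, year, mth, abs(amount))] % 2 == 1:
        key = (cust, mth, year)
        result[key] = max(result.get(key, amount), amount)

print(result)
```

The 41040 group has two rows, so it is dropped, and the maximum of what is left is 50. Note that the test looks only at the row count, so two uncancelled bills of the same amount in one month would be filtered out as well.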