2013-10-14 59 views
2

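I have two tables. The first contains activations and the second contains deactivations. Rows in the two tables are associated incrementally.

A deactivation is associated with exactly one activation, using the following rules:

  • The activation must come before the deactivation, but less than 92 days before it.
  • An activation that has already been associated cannot be associated again.

So, with some data: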

--a activations, b - deactivations 
create table a (id1 integer, date1 date); 
create table b (id2 integer, date2 date); 

insert into a values (1, '1-Feb-2013'); 
insert into a values (2, '2-Feb-2013'); 
insert into a values (3, '3-Feb-2013'); 
insert into a values (4, '1-Mar-2013'); 
insert into a values (5, '2-Mar-2013'); 
insert into a values (6, '1-May-2013'); 
insert into a values (7, '19-May-2013'); 

insert into b values (1, '1-May-2013'); 
insert into b values (2, '1-May-2013'); 
insert into b values (3, '15-May-2013'); 
insert into b values (4, '16-May-2013'); 
insert into b values (5, '17-May-2013'); 
insert into b values (6, '18-May-2013'); 

Desired output:

id1  date1                              id2  date2                         act_ord  deact_ord
1    February, 01 2013 00:00:00+0000    1    May, 01 2013 00:00:00+0000    1        1
2    February, 02 2013 00:00:00+0000    2    May, 01 2013 00:00:00+0000    2        2
4    March, 01 2013 00:00:00+0000       3    May, 15 2013 00:00:00+0000    4        3
5    March, 02 2013 00:00:00+0000       4    May, 16 2013 00:00:00+0000    5        4
6    May, 01 2013 00:00:00+0000         5    May, 17 2013 00:00:00+0000    6        5

The query that produces the candidate pairs would be:

select id1, date1, id2, date2 
from a 
join b 
on a.date1 >= b.date2 - 91 
and b.date2 >= a.date1; 
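
To get a feel for how quickly the candidate set grows, the join condition above can be replayed on the sample rows. This is only an illustrative sketch in Python rather than Oracle SQL; the function name candidate_pairs and the tuple layout are assumptions made for the example.

```python
from datetime import date, timedelta

# Sample rows copied from the question: (id, date).
A = [(1, date(2013, 2, 1)), (2, date(2013, 2, 2)), (3, date(2013, 2, 3)),
     (4, date(2013, 3, 1)), (5, date(2013, 3, 2)), (6, date(2013, 5, 1)),
     (7, date(2013, 5, 19))]
B = [(1, date(2013, 5, 1)), (2, date(2013, 5, 1)), (3, date(2013, 5, 15)),
     (4, date(2013, 5, 16)), (5, date(2013, 5, 17)), (6, date(2013, 5, 18))]


def candidate_pairs(acts, deacts, window_days=91):
    """Mirror the join: date1 >= date2 - 91 and date2 >= date1."""
    window = timedelta(days=window_days)
    return [(id1, id2)
            for id1, d1 in acts
            for id2, d2 in deacts
            if d2 - window <= d1 <= d2]


pairs = candidate_pairs(A, B)
print(len(pairs))  # number of candidate rows fed into the later steps
```

On the sample data this gives 24 candidate rows for only 5 final pairs, which hints at why a search over candidate paths becomes expensive with thousands of rows per client.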

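I managed to write a correct query using CONNECT BY, but it is too slow (I have millions of clients, and each client has thousands of activations and deactivations. This example is for a single client.)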

with chrn as 
(
select id1, date1, id2, date2, 
     dense_rank() over ( order by date1, id1) as act_ord, 
     dense_rank() over (order by date2, id2) as deact_ord 
from a 
join b 
on a.date1 >= b.date2 - 91 
and b.date2 >= a.date1 
) 
select * 
from (
    select s.*, row_number() over (partition by lvl order by act_ord+deact_ord) as rnk 
    from (
     select a1.*, level lvl 
     from chrn a1 
     connect by 
     prior deact_ord < deact_ord and 
     prior act_ord < act_ord and 
     (prior deact_ord = deact_ord - 1 or prior act_ord = act_ord - 1) 

     start with deact_ord = 1 and act_ord = 1 
)s 
)where rnk =1 
; 

see sqlfiddle

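I would like to find a faster solution, perhaps one that uses only analytic functions. The recursive query is too slow because the number of candidates and paths is large. Or perhaps I have simply not managed to reduce the number of candidates and paths.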

+0

Doesn't the ID have anything to do with the relationship between an activation and a deactivation?

+0

No, it is just a row identifier in my example. I have another key.

+0

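First, since you are not using a partition, use RANK instead of DENSE_RANK; both give the same result here, but RANK gives a 25% benefit. I am looking at the query for further changes. Have you tried changing this?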

Answers

1

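Your requirement cannot scale well as the number of records grows, because all previous pairs must be found before the next pair can be found.

Of course, as long as you only do this once, there is no way around that. But if you have to find new pairs regularly, I strongly suggest adding a deact_id column to table A: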

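-- Assumes A has the extra column, e.g.: alter table A add (deact_id integer);
create or replace trigger BI_B after insert on B for each row 
begin 
    -- Candidates: unassigned activations inside the 91-day window, oldest first.
    -- Note: date1 < :new.date2 is strict, while the join in the question
    -- (b.date2 >= a.date1) also accepts an activation on the same day.
    for c in 
    (select rowid 
    from A 
    where date1 >= :new.date2 - 91 
     and date1  < :new.date2 
     and deact_id is null 
    order by date1 
    ) 
    loop 
    -- Assign the new deactivation to the oldest candidate only, then stop.
    update A 
    set deact_id = :new.id2 
    where rowid = c.rowid; 

    exit; 
    end loop; 
end; 
/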
+0

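Interesting idea. I cannot actually change the fact tables, but this is the ideal way to do the job programmatically, without pure SQL. For each new deactivation, I should find the first activation that has not yet been assigned.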
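
The procedure described in this comment (for each deactivation, take the first activation that is still unassigned and inside the window) can be sketched as a single ordered pass. This is a minimal sketch in Python, not Oracle code. It assumes deactivations are processed in (date2, id2) order and uses the inclusive window from the question's join. The name match_greedy is hypothetical, and the sketch has only been checked against the sample data, not shown to be equivalent to the CONNECT BY query in general.

```python
from collections import deque
from datetime import date, timedelta

# Sample rows copied from the question: (id, date).
A = [(1, date(2013, 2, 1)), (2, date(2013, 2, 2)), (3, date(2013, 2, 3)),
     (4, date(2013, 3, 1)), (5, date(2013, 3, 2)), (6, date(2013, 5, 1)),
     (7, date(2013, 5, 19))]
B = [(1, date(2013, 5, 1)), (2, date(2013, 5, 1)), (3, date(2013, 5, 15)),
     (4, date(2013, 5, 16)), (5, date(2013, 5, 17)), (6, date(2013, 5, 18))]


def match_greedy(acts, deacts, window_days=91):
    """Pair each deactivation with the oldest unassigned activation in its window."""
    window = timedelta(days=window_days)
    acts = sorted(acts, key=lambda r: (r[1], r[0]))      # order by date1, id1
    deacts = sorted(deacts, key=lambda r: (r[1], r[0]))  # order by date2, id2
    pending = deque()  # unassigned activations already seen, oldest first
    nxt = 0            # index of the next activation not yet admitted
    pairs = []
    for id2, date2 in deacts:
        # Admit activations dated on or before this deactivation.
        while nxt < len(acts) and acts[nxt][1] <= date2:
            pending.append(acts[nxt])
            nxt += 1
        # Drop activations that fell out of the window; later deactivations
        # have a later lower bound, so these can never match again.
        while pending and pending[0][1] < date2 - window:
            pending.popleft()
        if pending:
            id1, date1 = pending.popleft()
            pairs.append((id1, date1, id2, date2))
    return pairs


for id1, date1, id2, date2 in match_greedy(A, B):
    print(id1, date1, id2, date2)
```

On the sample rows this pairs activations 1, 2, 4, 5 and 6 with deactivations 1 to 5, and leaves activations 3 and 7 and deactivation 6 unmatched, which agrees with the desired output in the question. After sorting, each row is touched a constant number of times, so the cost per client is dominated by the sort.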

0

Try this:

CREATE TABLE A (ID1 INTEGER, 
       DATE1 DATE); 

CREATE TABLE B (ID2 INTEGER, 
       DATE2 DATE); 

INSERT INTO 
     A 
VALUES 
     (1, 
     '1-Feb-2013'); 

INSERT INTO 
     A 
VALUES 
     (2, 
     '2-Feb-2013'); 

INSERT INTO 
     A 
VALUES 
     (3, 
     '3-Feb-2013'); 

INSERT INTO 
     A 
VALUES 
     (4, 
     '1-Mar-2013'); 

INSERT INTO 
     A 
VALUES 
     (5, 
     '2-Mar-2013'); 

INSERT INTO 
     A 
VALUES 
     (6, 
     '1-May-2013'); 

INSERT INTO 
     A 
VALUES 
     (7, 
     '19-May-2013'); 

INSERT INTO 
     B 
VALUES 
     (1, 
     '1-May-2013'); 

INSERT INTO 
     B 
VALUES 
     (2, 
     '1-May-2013'); 

INSERT INTO 
     B 
VALUES 
     (3, 
     '15-May-2013'); 

INSERT INTO 
     B 
VALUES 
     (4, 
     '16-May-2013'); 

INSERT INTO 
     B 
VALUES 
     (5, 
     '17-May-2013'); 

INSERT INTO 
     B 
VALUES 
     (6, 
     '18-May-2013'); 

COMMIT; 

BEGIN 
    DBMS_STATS.SET_TABLE_STATS (OWNNAME  => 'REALSPIRITUALS', 
          TABNAME => 'A', 
          NUMROWS => 100000000 ); 
END; 
/

BEGIN 
    DBMS_STATS.SET_TABLE_STATS (OWNNAME  => 'REALSPIRITUALS', 
          TABNAME => 'B', 
          NUMROWS => 100000000 ); 
END; 
/

Your query:

SET AUTOTRACE ON 

WITH CHRN 
    AS (SELECT 
      ID1, 
      DATE1, 
      ID2, 
      DATE2, 
      DENSE_RANK () 
       OVER (ORDER BY 
          DATE1, 
          ID1) 
       AS ACT_ORD, 
      DENSE_RANK () 
       OVER (ORDER BY 
          DATE2, 
          ID2) 
       AS DEACT_ORD 
     FROM 
       A 
      JOIN 
       B 
      ON A.DATE1 >= B.DATE2 
         - 91 
       AND B.DATE2 >= A.DATE1) 
SELECT 
     * 
FROM 
     (SELECT 
      S.*, 
      ROW_NUMBER () 
       OVER (PARTITION BY LVL 
         ORDER BY 
          ACT_ORD 
          + DEACT_ORD) 
       AS RNK 
     FROM 
      (SELECT 
        A1.*, 
        LEVEL LVL 
      FROM 
        CHRN A1 
      CONNECT BY 
         PRIOR DEACT_ORD < DEACT_ORD 
        AND PRIOR ACT_ORD < ACT_ORD 
        AND (PRIOR DEACT_ORD = DEACT_ORD 
             - 1 
         OR PRIOR ACT_ORD = ACT_ORD 
             - 1) 
      START WITH 
        DEACT_ORD = 1 
        AND ACT_ORD = 1) S) 
WHERE 
     RNK = 1; 

Your query through the CBO:

       ID1 DATE1            ID2 DATE2        ACT_ORD  DEACT_ORD        LVL        RNK
---------- --------- ---------- --------- ---------- ---------- ---------- ----------
         1 01-FEB-13          1 01-MAY-13          1          1          1          1
         2 02-FEB-13          2 01-MAY-13          2          2          2          1
         4 01-MAR-13          3 15-MAY-13          4          3          3          1
         5 02-MAR-13          4 16-MAY-13          5          4          4          1
         6 01-MAY-13          5 17-MAY-13          6          5          5          1

5 rows selected. 

Execution Plan 
---------------------------------------------------------- 
    0  SELECT STATEMENT Optimizer Mode=ALL_ROWS (Cost=16 G Card=25000 G Bytes=2235174G) 
    1 0 TEMP TABLE TRANSFORMATION 
    2 1  LOAD AS SELECT 
    3 2  WINDOW SORT (Cost=7 G Card=25000 G Bytes=1024454G) 
    4 3   WINDOW SORT (Cost=7 G Card=25000 G Bytes=1024454G) 
    5 4   MERGE JOIN (Cost=2 G Card=25000 G Bytes=1024454G) 
    6 5    SORT JOIN (Cost=667123 Card=100 M Bytes=2G) 
    7 6    TABLE ACCESS FULL SRINIV.A (Cost=770 Card=100 M Bytes=2G) 
    8 5    FILTER 
    9 8    SORT JOIN (Cost=667123 Card=100 M Bytes=2G) 
    10 9     TABLE ACCESS FULL SRINIV.B (Cost=770 Card=100 M Bytes=2G) 
    11 1  VIEW (Cost=9 G Card=25000 G Bytes=2235174G) 
    12 11  WINDOW SORT PUSHED RANK (Cost=9 G Card=25000 G Bytes=1932494G) 
    13 12   VIEW (Cost=887 M Card=25000 G Bytes=1932494G) 
    14 13   CONNECT BY NO FILTERING WITH START-WITH 
    15 14    COUNT 
    16 15    VIEW (Cost=887 M Card=25000 G Bytes=1629814G) 
    17 16     TABLE ACCESS FULL SYS.SYS_TEMP_0FD9D6820_3AD00CE0 (Cost=887 M Card=25000 G Bytes=1024454G) 


Statistics 
---------------------------------------------------------- 
      2 recursive calls 
      0 spare statistic 3 
      0 gcs messages sent 
      7 db block gets from cache 
      0 physical reads direct (lob) 
      0 queue position update 
      0 queue single row 
      0 queue ocp pages 
      0 HSC OLTP Compressed Blocks 
      0 HSC IDL Compressed Blocks 
      5 rows processed 

New query:

SET AUTOTRACE ON 


WITH CHRN 
    AS (SELECT 
      ID1, 
      DATE1, 
      ID2, 
      DATE2, 
      RANK () 
       OVER (ORDER BY 
          DATE1, 
          ID1) 
       AS ACT_ORD, 
      RANK () 
       OVER (ORDER BY 
          DATE2, 
          ID2) 
       AS DEACT_ORD 
     FROM 
      A, 
      B 
     WHERE 
      DATE2 
      - DATE1 < 92 
      AND ID1 = ID2) 
SELECT 
     * 
FROM 
     (SELECT 
      S.*, 
      ROW_NUMBER () 
       OVER (PARTITION BY LVL 
         ORDER BY 
          ACT_ORD 
          + DEACT_ORD) 
       AS RNK 
     FROM 
      (SELECT 
        A1.*, 
        LEVEL LVL 
      FROM 
        CHRN A1 
      CONNECT BY 
         PRIOR DEACT_ORD < DEACT_ORD 
        AND PRIOR ACT_ORD < ACT_ORD 
        AND (PRIOR DEACT_ORD = DEACT_ORD 
             - 1 
         OR PRIOR ACT_ORD = ACT_ORD 
             - 1) 
      START WITH 
        DEACT_ORD = 1 
        AND ACT_ORD = 1) S) 
WHERE 
     RNK = 1; 

The new query through the CBO:

       ID1 DATE1            ID2 DATE2        ACT_ORD  DEACT_ORD        LVL        RNK
---------- --------- ---------- --------- ---------- ---------- ---------- ----------
         1 01-FEB-13          1 01-MAY-13          1          1          1          1
         2 02-FEB-13          2 01-MAY-13          2          2          2          1
         4 01-MAR-13          3 15-MAY-13          4          3          3          1
         5 02-MAR-13          4 16-MAY-13          5          4          4          1
         6 01-MAY-13          5 17-MAY-13          6          5          5          1

5 rows selected. 



Execution Plan 
---------------------------------------------------------- 
0  SELECT STATEMENT Optimizer Mode=ALL_ROWS (Cost=538808 Card=5 M Bytes=457 M) 
1 0 TEMP TABLE TRANSFORMATION 
2 1  LOAD AS SELECT 
3 2  WINDOW SORT (Cost=436441 Card=5 M Bytes=209 M) 
4 3   WINDOW SORT (Cost=436441 Card=5 M Bytes=209 M) 
5 4   HASH JOIN (Cost=324556 Card=5 M Bytes=209 M) 
6 5    TABLE ACCESS FULL REALSPIRITUALS.A (Cost=770 Card=100 M Bytes=2G) 
7 5    TABLE ACCESS FULL REALSPIRITUALS.B (Cost=770 Card=100 M Bytes=2G) 
8 1  VIEW (Cost=102367 Card=5 M Bytes=457 M) 
9 8  WINDOW SORT PUSHED RANK (Cost=102367 Card=5 M Bytes=395 M) 
10 9   VIEW (Cost=5816 Card=5 M Bytes=395 M) 
11 10   CONNECT BY NO FILTERING WITH START-WITH 
12 11    COUNT 
13 12    VIEW (Cost=5816 Card=5 M Bytes=333 M) 
14 13     TABLE ACCESS FULL SYS.SYS_TEMP_0FD9D6822_3AD00CE0 (Cost=5816 Card=5 M Bytes=209 M) 

Statistics 
---------------------------------------------------------- 
     2 recursive calls 
     0 spare statistic 3 
     0 gcs messages sent 
     7 db block gets from cache 
     0 physical reads direct (lob) 
     0 queue position update 
     0 queue single row 
     0 queue ocp pages 
     0 HSC OLTP Compressed Blocks 
     0 HSC IDL Compressed Blocks 
5 rows processed 
+0

Sorry, but this does not assign anything to deact no. 3 (because deact_ord = act_ord). The correct algorithm would match deact no. 3 with act no. 4. See [this fiddle](http://sqlfiddle.com/#!4/816d8/21)
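
The objection can be checked on the sample rows by replaying the new query's join condition (DATE2 - DATE1 < 92 AND ID1 = ID2). The following is a small Python sketch, not Oracle SQL; the function name same_id_candidates is an assumption made for the example, and it relies on the ids in B being unique, as they are in the sample.

```python
from datetime import date

# Sample rows copied from the question: (id, date).
A = [(1, date(2013, 2, 1)), (2, date(2013, 2, 2)), (3, date(2013, 2, 3)),
     (4, date(2013, 3, 1)), (5, date(2013, 3, 2)), (6, date(2013, 5, 1)),
     (7, date(2013, 5, 19))]
B = [(1, date(2013, 5, 1)), (2, date(2013, 5, 1)), (3, date(2013, 5, 15)),
     (4, date(2013, 5, 16)), (5, date(2013, 5, 17)), (6, date(2013, 5, 18))]


def same_id_candidates(acts, deacts, max_days=92):
    """Mirror the join: date2 - date1 < 92 and id1 = id2."""
    deact_date = dict(deacts)  # id2 -> date2, assumes unique ids in B
    return [(id1, id1)
            for id1, d1 in acts
            if id1 in deact_date and (deact_date[id1] - d1).days < max_days]


candidates = same_id_candidates(A, B)
print(candidates)
```

With the id equality in place, deactivation 3 can only be paired with activation 3, which is 101 days older and therefore filtered out, so deactivation 3 has no candidate row at all and the pair (activation 4, deactivation 3) can never be produced.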

+0

Thanks... I see – SriniV