2014-10-30 107 views
2

我有一系列可能重疊的編號時間間隔。重要提示:沒有兩個間隔同時開始,開始間隔的時間是嚴格內插用於在一系列重疊(時間)間隔內查找非重疊間隔的子系列的SQL查詢

插圖:

Task 1: 1111111 
Task 2:  22222222222  
Task 3:    333333333333333 
Task 4:     444444 
Task 5:       5555555 
Task 6:         66 
    . 
    . 
    . 
     0 --- time axis ---> 

的間隔代表應該執行的任務。我正在尋找一個SQL查詢,該查詢選擇可以執行的任務,給定的約束條件是隻有一個任務可以在同一時間執行。第一項任務總是執行。接下來,從第一個任務完成後開始的所有任務開始執行最早開始的任務。等等。

結果:任務1,3和6可以執行。插圖:

Task 1: 1111111        (yes, first) 
Task 2:  -----------      (no, task 1 is running when 2 begins) 
Task 3:    333333333333333   (yes) 
Task 4:     ------    (no, task 3 is running when 4 begins) 
Task 5:       -------  (no, task 3 is running when 5 begins) 
Task 6:         66 (yes) 
    . 
    . 
    . 
     0 --- time axis ---> 

用迭代法,該算法是容易的:在一個循環迭代過以升序記憶間隔的最後一個所選間隔的末尾。但是,我想問你一個SQL查詢,可能使用窗口函數,可以執行例如。在Google BigQuery上。

的任務表的架構:

task_id: integer, 
start_timestamp: integer, 
duration_seconds: integer. 

樣本數據:

task_id,start_timestamp,duration_seconds 
1,1,7 
2,4,11 
3,12,15 
4,16,6 
5,24,7 
6,33,2 
7,37,4 
8,42,13 
9,47,3 
10,50,2 
11,54,21 
12,58,14 
13,66,8 
14,72,7 
15,80,6 
16,88,16 
17,92,14 
18,102,3 
19,109,2 
20,119,10 
21,123,13 
22,128,21 
23,138,7 
24,141,17 
25,146,9 
26,154,17 
27,160,17 
28,164,13 
29,173,21 
30,181,7 

結果 - 所選擇的任務:

1,3,6,7,8,12,14,15,16,19,20,23,25,27,30 

樣本數據的插圖:

 
Task 1: 1111111 
Task 2:  22222222222 
Task 3:    333333333333333 
Task 4:     444444 
Task 5:       5555555 
Task 6:         66 
Task 7:          7777 
Task 8:           8888888888888 
Task 9:            999 
Task 10:             10 
Task 11:              11xxxxxxxxxxxxxxxxxxx 
Task 12:               12xxxxxxxxxxxx 
Task 13:                 13xxxxxx 
Task 14:                   14xxxxx 
Task 15:                     15xxxx 
Task 16:                       16xxxxxxxxxxxxxx 
Task 17:                        17xxxxxxxxxxxx 
Task 18:                          18x 
Task 19:                            19 
Task 20:                              20xxxxxxxx 
Task 21:                               21xxxxxxxxxxx 
Task 22:                                 22xxxxxxxxxxxxxxxxxxx 
Task 23:                                   23xxxxx 
Task 24:                                    24xxxxxxxxxxxxxxx 
Task 25:                                     25xxxxxxx 
Task 26:                                       26xxxxxxxxxxxxxxx 
Task 27:                                         27xxxxxxxxxxxxxxx 
Task 28:                                          28xxxxxxxxxxx 
Task 29:                                            29xxxxxxxxxxxxxxxxxxx 
Task 30:                                              30xxxxx 

非常感謝您的幫助。

+1

你能提供你有工作的SQL模式?你有開始日期時間和持續時間嗎?或者你真的用數字開始索引和一個n位數的長字符串來表示任務持續時間? – 2014-10-30 20:06:41

+0

選擇...其中NOT EXISTS(具有重疊時間範圍的任務) – 2014-10-30 21:23:24

+0

@PatrickM我有開始時間戳和持續時間。我剛剛將模式附加到問題的文本中。 – Nathan 2014-10-30 21:28:11

回答

2

執行此操作的一種方法如下: 查找重疊任務(開始時間介於其他任務的開始時間和結束時間之間),並提取所有其他任務。

Select task_id 
FROM [table] 
where Task_id not in( 
    Select B.task_id FROM 
    (SELECT task_id, start_timestamp, duration_seconds ,start_timestamp+duration_seconds as end_timestamp 
    FROM [table]) as A 
    CROSS JOIN EACH 
    (SELECT task_id, start_timestamp, duration_seconds ,start_timestamp+duration_seconds as end_timestamp 
    FROM [table]) as B 
    where B.start_timestamp>=A.start_timestamp 
    and B.start_timestamp<A.end_timestamp 
    and A.task_id<>b.task_id) 

此解決方案未使用窗口功能。

使用窗口功能是可行的,但您必須假設並行並行作業的限制(在本例中爲3)。在這裏,我現在用的LAG窗函數找到3個先導任務,並檢查是否有一定的任務重疊其中之一(開始時間爲起始時間和分組開始任務的結束時間之間)

Select task_id 
FROM 
(Select task_id, start_timestamp, duration_seconds ,end_timestamp 
,LAG(task_id,1) OVER (ORDER BY start_timestamp) as LAG_task_id_1 
,LAG(start_timestamp,1) OVER (ORDER BY start_timestamp) as LAG_start_timestamp_1 
,LAG(duration_seconds,1) OVER (ORDER BY start_timestamp) as LAG_duration_seconds_1 
,LAG(end_timestamp,1) OVER (ORDER BY start_timestamp) as LAG_end_timestamp_1 
,LAG(task_id,2) OVER (ORDER BY start_timestamp) as LAG_task_id_2 
,LAG(start_timestamp,2) OVER (ORDER BY start_timestamp) as LAG_start_timestamp_2 
,LAG(duration_seconds,21) OVER (ORDER BY start_timestamp) as LAG_duration_seconds_2 
,LAG(end_timestamp,2) OVER (ORDER BY start_timestamp) as LAG_end_timestamp_2 
,LAG(task_id,3) OVER (ORDER BY start_timestamp) as LAG_task_id_3 
,LAG(start_timestamp,3) OVER (ORDER BY start_timestamp) as LAG_start_timestamp_3 
,LAG(duration_seconds,3) OVER (ORDER BY start_timestamp) as LAG_duration_seconds_3 
,LAG(end_timestamp,3) OVER (ORDER BY start_timestamp) as LAG_end_timestamp_3 
FROM 
(SELECT task_id, start_timestamp, duration_seconds ,start_timestamp+duration_seconds as end_timestamp 
FROM [table])) 
where 
(NOT(start_timestamp>=LAG_start_timestamp_1 and start_timestamp<LAG_end_timestamp_1) 
and NOT(start_timestamp>=LAG_start_timestamp_2 and start_timestamp<LAG_end_timestamp_2) 
and NOT(start_timestamp>=LAG_start_timestamp_3 and start_timestamp<LAG_end_timestamp_3)) 
OR LAG_start_timestamp_1 IS NULL 

希望這有助於。 ..

+0

謝謝你的回答。如果我理解得好,它會發現不重疊的任務。但是,它不能以期望的方式處理任務的順序。例如_task 3_與_task 2_重疊,但它仍然是結果集的一部分。原因是,_task 2_不是結果集的一部分(當_task 2_開始時_task 1_正在運行),所以在決定是否可以執行_task 3_時,它是無關緊要的。我意識到這個問題的困難。如果存在SQL解決方案,則必須以某種方式建模接受任務的阻止行爲。 – Nathan 2014-10-31 22:49:58

+0

如果存在併發任務的限制,可以嘗試使用LAG函數添加驗證以前任務的有效性的條件。 – 2014-11-01 05:22:39

0

剛寫了一些非常相關的東西(在oracle中),也許它可以幫助某人。 我發現我的解決方案http://technology.amis.nl/2007/05/07/creating-a-gantt-chart-in-sql/

下面是幫助說明甘特圖的代碼。

WITH 
PERIODS AS 
    (SELECT TASK_ID LABEL ,  
      START_TIMESTAMP START_DATE ,  
      START_TIMESTAMP+DURATION_SECONDS END_DATE 
    FROM testet 
), 
LIMITS AS -- determine the earliest starting date and the latest end date to determine the overall width of the chart 
    (SELECT MIN(START_DATE) PERIOD_START ,  
      MAX(END_DATE) PERIOD_END ,  
      80 WIDTH -- set the width as the number of characters 
    FROM PERIODS 
), 
BARS AS 
    (SELECT LPAD(LABEL, '20')||'|' ACTIVITY ,   
      (START_DATE - PERIOD_START)/(PERIOD_END - PERIOD_START) * WIDTH FROM_POS, -- the starting position for the bar   
      (END_DATE - PERIOD_START)/(PERIOD_END - PERIOD_START) * WIDTH TO_POS -- the end position for the bar 
    FROM  PERIODS ,  LIMITS 
) 
SELECT ACTIVITY||   
     LPAD('I',FROM_POS)   || 
     RPAD('-', TO_POS - FROM_POS, '-')   || 
     'I' GANTT 
FROM  BARS 
UNION ALL 
SELECT RPAD('_',WIDTH + 22,'_') 
FROM LIMITS 
UNION ALL 
SELECT LPAD('|',21)  || 
     PERIOD_START  || 
     LPAD(PERIOD_END, 
     WIDTH - 11) 
FROM LIMITS; 

輸出:

    1|--I 
        2|I----I 
        3| I------I 
        4|  I--I 
        5|  I--I 
        6|   II 
        7|    I-I 
        8|    I-----I 
        9|     I-I 
        10|     II 
        11|     I--------I 
        12|      I-----I 
        13|       I---I 
        14|       I--I 
        15|        I--I 
        16|         I------I 
        17|         I-----I 
        18|          I-I 
        19|           II 
        20|            I----I 
        21|             I-----I 
        22|             I--------I 
        23|              I--I 
        24|               I-------I 
        25|               I---I 
        26|                I-------I 
        27|                I-------I 
        28|                 I-----I 
        29|                  I--------I 
        30|                   I--I 
______________________________________________________________________________________________________ 
        |1                 194