2010-04-19 104 views
7

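Find consecutive rows and calculate duration

I have a set of data that tells me whether several systems were available or not, sampled every 5 or 15 minutes. For now, the size of the time increment shouldn't matter.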

The data looks like this:

Status  Time   System_ID 
T   10:00   S01 
T   10:15   S01 
F   10:30   S01 
F   10:45   S01 
F   11:00   S01 
T   11:15   S01 
T   11:30   S01 
F   11:45   S01 
F   12:00   S01 
F   12:15   S01 
T   12:30   S01 

F   10:00   S02 
F   10:15   S02 
F   10:30   S02 
F   10:45   S02 
F   11:00   S02 
T   11:15   S02 
T   11:30   S02 

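I want to create a view that tells me when a system was unavailable (i.e. when the status is F): from what time, to what time, and for how long (Duration = To - From).

Desired result: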

System_ID From   To   Duration 
S01   10:30   11:00   00:30 
S01   11:45   12:15   00:30 
S02   10:00   11:00   01:00 

Here is the script for the data:

DROP SCHEMA IF EXISTS Sys_data CASCADE; 
CREATE SCHEMA Sys_data; 

CREATE TABLE test_data (
      status BOOLEAN, 
      dTime TIME, 
      sys_ID VARCHAR(10), 
      PRIMARY KEY (dTime, sys_ID) 
); 

INSERT INTO test_data (status, dTime, sys_ID) VALUES (TRUE, '10:00:00', 'S01'); 
INSERT INTO test_data (status, dTime, sys_ID) VALUES (TRUE, '10:15:00', 'S01'); 
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '10:30:00', 'S01'); 
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '10:45:00', 'S01'); 
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '11:00:00', 'S01'); 
INSERT INTO test_data (status, dTime, sys_ID) VALUES (TRUE, '11:15:00', 'S01'); 
INSERT INTO test_data (status, dTime, sys_ID) VALUES (TRUE, '11:30:00', 'S01'); 
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '11:45:00', 'S01'); 
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '12:00:00', 'S01'); 
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '12:15:00', 'S01'); 
INSERT INTO test_data (status, dTime, sys_ID) VALUES (TRUE, '12:30:00', 'S01'); 
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '10:00:00', 'S02'); 
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '10:15:00', 'S02'); 
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '10:30:00', 'S02'); 
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '10:45:00', 'S02'); 
INSERT INTO test_data (status, dTime, sys_ID) VALUES (FALSE, '11:00:00', 'S02'); 
INSERT INTO test_data (status, dTime, sys_ID) VALUES (TRUE, '11:15:00', 'S02'); 
INSERT INTO test_data (status, dTime, sys_ID) VALUES (TRUE, '11:30:00', 'S02'); 

Thanks in advance!

+1

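Wouldn't you want to query from the first F after a T to the next T? The system is not necessarily available between the last F in a sequence and the next T. – 2010-04-19 08:11:45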

+0

You are right. It should be the next T. – MannyKo 2010-04-19 09:17:21
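To make the point of the two comments above concrete, here is a minimal Python sketch (not part of the original thread) in which each outage runs from the first F to the next T sample instead of to the last F. The function names `minutes_between` and `outages_until_next_true` and the `(status, 'HH:MM')` tuple layout are assumptions chosen for illustration; if no later T exists, the sketch falls back to the last F seen.

```python
from datetime import datetime

def minutes_between(start, end):
    """Whole minutes between two 'HH:MM' strings on the same day."""
    delta = datetime.strptime(end, "%H:%M") - datetime.strptime(start, "%H:%M")
    return int(delta.total_seconds() // 60)

def outages_until_next_true(samples):
    """samples: (status, 'HH:MM') tuples for ONE system, sorted by time.

    Returns (down_from, down_to, minutes), where down_to is the next
    T sample after the run of F samples.
    """
    result = []
    start = None   # first F of the run currently being tracked
    last_f = None  # most recent F, used only if no T follows
    for status, t in samples:
        if not status:
            if start is None:
                start = t
            last_f = t
        elif start is not None:
            result.append((start, t, minutes_between(start, t)))
            start = None
    if start is not None:  # data ended while the system was still down
        result.append((start, last_f, minutes_between(start, last_f)))
    return result

# S01 samples from the question
s01 = [(True, "10:00"), (True, "10:15"), (False, "10:30"), (False, "10:45"),
       (False, "11:00"), (True, "11:15"), (True, "11:30"), (False, "11:45"),
       (False, "12:00"), (False, "12:15"), (True, "12:30")]
outages_s01 = outages_until_next_true(s01)
for row in outages_s01:
    print(row)
```

With this definition the two S01 outages last 45 minutes each (10:30 to 11:15 and 11:45 to 12:30), rather than the 30 minutes listed in the desired result, which stops at the last F.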

Answers

2

Maybe not optimal, but it works :)

select sys_id, first_time as down_from, max(dTime) as down_to 
from (select status, sys_id, dTime, 
      (select min(td_add2.dTime) 
       from test_data td_add2 
       where td_add2.dtime <= x.dTime 
       and td_add2.dtime >= COALESCE(x.prev_time,x.min_time) 
       and td_add2.status = x.status  
       and td_add2.sys_id = x.sys_id) as first_time 
     from (select td_main.status, td_main.sys_id, td_main.dTime,  
           (select max(td_add.dTime) 
            from test_data td_add 
            where td_add.dtime < td_main.dTime 
            and td_add.status != td_main.status  
            and td_add.sys_id = td_main.sys_id) as prev_time, 
           (select min(td_add.dTime) 
            from test_data td_add 
            -- <= (not <) so that a system's very first sample gets a non-NULL lower bound 
            where td_add.dtime <= td_main.dTime 
            and td_add.sys_id = td_main.sys_id) as min_time                          
       from test_data td_main) x 
    ) y 
where status = false 
and first_time is not null 
group by sys_id, first_time 
order by sys_id, first_time 
+--------+-----------+----------+
| sys_id | down_from | down_to  |
+--------+-----------+----------+
| S01    | 10:30:00  | 11:00:00 |
| S01    | 11:45:00  | 12:15:00 |
| S02    | 10:00:00  | 11:00:00 |
+--------+-----------+----------+
3 rows in set (0.00 sec)
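The output above has no Duration column, which the question asked for. The sketch below is a different, shorter query, not the one from this answer: it labels every F row with the number of T rows that precede it for the same system, then groups on that label. It runs against SQLite through Python's `sqlite3` module purely as a stand-in for MySQL, so the `strftime('%s', ...)` arithmetic is SQLite-specific, and the helper names `make_rows` and `load_and_query` are made up for this example.

```python
import sqlite3

def make_rows(sys_id, pattern, first_minute=600, step=15):
    """Build (status, 'HH:MM:SS', sys_id) rows from a pattern such as 'TTFFT'."""
    rows = []
    for i, flag in enumerate(pattern):
        m = first_minute + step * i
        rows.append((1 if flag == "T" else 0, f"{m // 60:02d}:{m % 60:02d}:00", sys_id))
    return rows

# Same samples as in the question: S01 and S02 starting at 10:00, every 15 minutes
ROWS = make_rows("S01", "TTFFFTTFFFT") + make_rows("S02", "FFFFFTT")

QUERY = """
SELECT sys_id,
       MIN(dtime) AS down_from,
       MAX(dtime) AS down_to,
       (CAST(strftime('%s', MAX(dtime)) AS INTEGER)
        - CAST(strftime('%s', MIN(dtime)) AS INTEGER)) / 60 AS minutes
FROM (SELECT m.sys_id, m.dtime,
             -- number of T rows before this F row: constant within one outage
             (SELECT COUNT(*) FROM test_data p
              WHERE p.sys_id = m.sys_id AND p.status = 1
                AND p.dtime < m.dtime) AS grp
      FROM test_data m
      WHERE m.status = 0) f
GROUP BY sys_id, grp
ORDER BY sys_id, down_from
"""

def load_and_query():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test_data (status INTEGER, dtime TEXT, "
                 "sys_id TEXT, PRIMARY KEY (dtime, sys_id))")
    conn.executemany("INSERT INTO test_data (status, dtime, sys_id) VALUES (?, ?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows

outage_rows = load_and_query()
for row in outage_rows:
    print(row)
```

On the question's data this yields the same three outages as the table above, with 30, 30 and 60 minutes as the durations. In MySQL the subtraction would presumably be written with `TIMEDIFF` or `TIME_TO_SEC` instead; that part is not tested here.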
+0

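+1 for testing the solution (minor note: the ORDER BY is redundant; "If you use GROUP BY, output rows are sorted according to the GROUP BY columns as if you had an ORDER BY for the same columns.") – Unreason 2010-04-19 13:02:21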

+0

I didn't know MySQL works so strangely :). PostgreSQL and Oracle do not guarantee any ordering when GROUP BY is used. Sorting in GROUP BY is a side effect. – 2010-04-19 13:35:20

+0

Thank you very much! This works! – MannyKo 2010-04-20 04:57:33

0

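Slightly longer, but it seems to work in PostgreSQL. The basic idea:

  1. Find the times at which the system status changes
  2. Keep only the first and last time of each run: the first is where the previous status differs (or there is none), the last is where the next status differs (or there is none)
  3. Compute the difference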
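The three steps above can be sketched in plain Python, with list indexing standing in for the LAG and LEAD window functions. This is an illustrative sketch rather than the answer's method; the names `sample`, `down_intervals` and the `(sys_id, 'HH:MM', status)` row layout are assumptions.

```python
from datetime import datetime

def sample(sys_id, pattern, first_minute=600, step=15):
    """Build (sys_id, 'HH:MM', status) rows from a pattern such as 'TTFFT'."""
    return [(sys_id, f"{(first_minute + step * i) // 60:02d}:{(first_minute + step * i) % 60:02d}",
             flag == "T") for i, flag in enumerate(pattern)]

def down_intervals(rows):
    """Return (sys_id, down_from, down_to, minutes) for every run of F rows."""
    # Sort by system, then time; zero-padded 'HH:MM' strings sort chronologically.
    rows = sorted(rows)
    result, start = [], None
    for i, (sys_id, t, status) in enumerate(rows):
        # Step 1: look at the neighbouring rows of the same system (LAG / LEAD).
        prev_status = rows[i - 1][2] if i > 0 and rows[i - 1][0] == sys_id else None
        next_status = rows[i + 1][2] if i + 1 < len(rows) and rows[i + 1][0] == sys_id else None
        if status:
            continue  # only outages (F rows) are reported
        # Step 2: a run starts where the previous status differs or is missing,
        # and ends where the next status differs or is missing.
        if prev_status is None or prev_status != status:
            start = t
        if next_status is None or next_status != status:
            # Step 3: compute the difference between last and first time.
            delta = datetime.strptime(t, "%H:%M") - datetime.strptime(start, "%H:%M")
            result.append((sys_id, start, t, int(delta.total_seconds() // 60)))
    return result

# Same samples as in the question
data = sample("S01", "TTFFFTTFFFT") + sample("S02", "FFFFFTT")
intervals = down_intervals(data)
for row in intervals:
    print(row)
```

A run consisting of a single F row is both a start and an end, so it is reported with a duration of 0 minutes.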

Here is the code:

SELECT sys_id, 
    status, 
    coalesce(end_time, end_time2) - start_time duration 
FROM (
SELECT sys_id, status, start_time, end_time, 
lead(end_time) over (partition by sys_id order by dtime) end_time2 
FROM ( 
    SELECT sys_id, status, dtime, start_time, end_time 
    FROM (
     SELECT sys_id, status, dtime, 
     CASE WHEN last_status != status OR last_status IS NULL THEN dtime ELSE NULL END start_time, 
     CASE WHEN next_status != status OR next_status IS NULL THEN dtime ELSE NULL END end_time 
     FROM (
     SELECT sys_id, status, dtime, 
      LAG(status) OVER (PARTITION BY sys_id ORDER BY sys_id, dtime) last_status, 
      LEAD(status) OVER (PARTITION BY sys_id ORDER BY sys_id, dtime) next_status 
      FROM test_data 
      ORDER BY sys_id, dtime 
     ) surrounding_status 
    ) last_next_times 

    WHERE start_time IS NOT NULL OR end_time IS NOT NULL 
    ORDER BY sys_id, dtime 
) start_end_times 
) find_last_time 
WHERE start_time IS NOT NULL AND status = FALSE 
ORDER BY sys_id, start_time; 

This is just straightforward code; I think there is probably a simpler solution.

+0

Oh, I'm sorry, I overlooked the mysql tag. As far as I know, this will not work in MySQL, because it has no analytic/window functions. – Stiivi 2010-04-19 08:08:02

1

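Here is a cursor-based solution. I don't know whether MySQL supports PARTITION BY, which is why I used a cursor. It was tested on SQL Server 2008 and it works; hopefully it also works in MySQL, but at the very least it will give you an idea.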

CREATE TABLE #offline_data 
    (
    dTime DATETIME 
    ,sys_ID VARCHAR(50) 
    ,GroupID INTEGER 
    ) 


DECLARE @status BIT 
DECLARE @dTime DATETIME 
DECLARE @sys_ID VARCHAR(50) 

DECLARE @GroupID INTEGER = 0 


DECLARE test_cur CURSOR 
FOR SELECT 
[status] 
,[dTime] 
,[sys_ID] 
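FROM 
[dbo].[test_data] 
-- the GroupID counter below only works if rows arrive per system in time order 
ORDER BY 
[sys_ID] 
,[dTime] 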

OPEN test_cur 
FETCH NEXT FROM test_cur INTO @status, @dTime, @sys_ID 

WHILE @@FETCH_STATUS = 0 
    BEGIN 

     IF @status = 0 
      INSERT [#offline_data] 
        ([dTime] , [sys_ID] , [GroupID]) 
      VALUES 
        (@dTime , @sys_ID , @GroupID) 
     ELSE 
      SET @GroupID += 1 

     FETCH NEXT FROM test_cur INTO @status, @dTime, @sys_ID 
    END 

CLOSE test_cur 
DEALLOCATE test_cur 

SELECT 
    [sys_ID] 'SYSTEM_ID' 
    ,CONVERT(VARCHAR(8) , MIN([dTime]) , 108) 'FROM' 
    ,CONVERT(VARCHAR(8) , MAX([dTime]) , 108) 'TO' 
    ,CONVERT(VARCHAR(8) , DATEADD(mi , DATEDIFF(mi , MIN([dTime]) , MAX([dTime])) , '1900-01-01T00:00:00.000') , 108) 'DURATION' 
FROM 
    #offline_data 
GROUP BY 
    [sys_ID] 
    ,[GroupID]
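For readers without SQL Server at hand, the cursor loop above can be mirrored in a few lines of Python. This is only a sketch of the same counter idea: `cursor_style_outages` and the `(sys_id, 'HH:MM', status)` row layout are hypothetical, and the `sorted` call plays the role of an explicit ordering by system and time, on which the counter depends.

```python
from collections import defaultdict
from datetime import datetime

def cursor_style_outages(rows):
    """rows: (sys_id, 'HH:MM', status) tuples in any order.

    Mirrors the cursor loop: every T row bumps a counter, every F row is
    stored under the current counter value, and the final grouping is done
    on (sys_id, group_id).
    """
    group_id = 0
    offline = defaultdict(list)  # (sys_id, group_id) -> list of F times
    for sys_id, t, status in sorted(rows):  # per system, in time order
        if status:
            group_id += 1
        else:
            offline[(sys_id, group_id)].append(t)

    result = []
    for (sys_id, _), times in sorted(offline.items()):
        start, end = min(times), max(times)
        delta = datetime.strptime(end, "%H:%M") - datetime.strptime(start, "%H:%M")
        minutes = int(delta.total_seconds() // 60)
        # HH:MM:SS text, similar in shape to CONVERT(..., 108)
        result.append((sys_id, start, end, f"{minutes // 60:02d}:{minutes % 60:02d}:00"))
    return result

# S01 ends while down and S02 starts while down: both share group_id 1,
# so sys_id has to be part of the grouping key.
demo = [("S01", "10:00", True), ("S01", "10:15", False), ("S01", "10:30", False),
        ("S02", "10:00", False), ("S02", "10:15", True)]
demo_outages = cursor_style_outages(demo)
for row in demo_outages:
    print(row)
```

The demo data shows why the final GROUP BY includes both `sys_ID` and `GroupID`: the counter alone does not change when one system's rows end in F and the next system's rows begin with F.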