2014-01-28 35 views

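MSSQL: Identifying a run of unchanging (flat-line) values in a time series

I am working with time series data from sensor measurements. I need to identify where the data has flatlined, which indicates a sensor failure. I want to select the places where more than 3 consecutive values are identical within the last 24 hours.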

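I think I may need a loop, but I have not worked with loops in SQL. I assume I need a subquery to order by DateTime. I have also looked at LEAD and LAG. In addition, I need to distinguish by SiteID and VariableID, which I think can be done with PARTITION.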

The data looks like this:

    SiteID VariableID DateTime Value
    5 1 2014-01-27 12:15 5.576 
    5 1 2014-01-27 12:30 5.487 
    5 1 2014-01-27 12:45 5.573 
    5 1 2014-01-27 13:00 5.903 
    5 87 2014-01-27 12:15 -273.2 
    5 87 2014-01-27 12:30 -273.2 
    5 87 2014-01-27 12:45 -273.2 
    5 87 2014-01-27 13:00 -273.2 
    5 88 2014-01-27 12:15 -273.2 
    5 88 2014-01-27 12:30 -273.2 
    5 88 2014-01-27 12:45 -273.2 
    5 88 2014-01-27 13:00 -273.2 
    5 89 2014-01-27 12:15 -273.2 
    5 89 2014-01-27 12:30 -273.2 
    5 89 2014-01-27 12:45 -273.2 
    5 89 2014-01-27 13:00 -273.2 
    5 2 2014-01-27 12:15 30.61 
    5 2 2014-01-27 12:30 38.73 
    5 2 2014-01-27 12:45 32.84 
    5 2 2014-01-27 13:00 31.62 
    5 3 2014-01-27 12:15 -9.53 
    5 3 2014-01-27 12:30 -8.61 
    5 3 2014-01-27 12:45 -8.76 
    5 3 2014-01-27 13:00 -9.32 
    5 4 2014-01-27 12:15 0.298 
    5 4 2014-01-27 12:30 0.32 
    5 4 2014-01-27 12:45 0.317 
    5 4 2014-01-27 13:00 0.302 

I would like to produce something like:

    SiteID VariableID StartingDateTime ValueCount Value
    5   87  2014-01-27 12:15  4   -273.2
    5   88  2014-01-27 12:15  4   -273.2
    5   89  2014-01-27 12:15  4   -273.2

Answer


SQL Fiddle

Using this schema and data (slightly modified, just to make sure everything works):

CREATE TABLE TimeSeries (
    SiteId INT, 
    VariableId INT, 
    DateTime DATETIME, 
    Value NUMERIC(15,5) 
); 

INSERT INTO TimeSeries VALUES ( 5, 1 , '2014-01-27 12:15' , 5.576 ); 
INSERT INTO TimeSeries VALUES ( 5, 1 , '2014-01-27 12:30' , 5.487 ); 
INSERT INTO TimeSeries VALUES ( 5, 1 , '2014-01-27 12:45' , 5.573 ); 
INSERT INTO TimeSeries VALUES ( 5, 1 , '2014-01-27 13:00' , 5.903 ); 
INSERT INTO TimeSeries VALUES ( 5, 87 , '2014-01-27 12:15' , -273.2 ); 
INSERT INTO TimeSeries VALUES ( 5, 87 , '2014-01-27 12:30' , -273.2 ); 
INSERT INTO TimeSeries VALUES ( 5, 87 , '2014-01-27 12:45' , -273.2 ); 
INSERT INTO TimeSeries VALUES ( 5, 87 , '2014-01-27 13:00' , -273.2 ); 
INSERT INTO TimeSeries VALUES ( 5, 88 , '2014-01-27 12:15' , -273.2 ); 
INSERT INTO TimeSeries VALUES ( 5, 88 , '2014-01-27 12:30' , -273.2 ); 
INSERT INTO TimeSeries VALUES ( 5, 88 , '2014-01-27 12:45' , -273.2 ); 
INSERT INTO TimeSeries VALUES ( 5, 88 , '2014-01-27 13:00' , -273.2 ); 
INSERT INTO TimeSeries VALUES ( 5, 89 , '2014-01-27 12:15' , -273.2 ); 
INSERT INTO TimeSeries VALUES ( 5, 89 , '2014-01-27 12:30' , -273.2 ); 
INSERT INTO TimeSeries VALUES ( 5, 89 , '2014-01-27 12:45' , -273.2 ); 
INSERT INTO TimeSeries VALUES ( 5, 89 , '2014-01-27 13:00' , -273.2 ); 
INSERT INTO TimeSeries VALUES ( 5, 2 , '2014-01-27 12:15' , 30.61 ); 
INSERT INTO TimeSeries VALUES ( 5, 2 , '2014-01-27 12:30' , 38.73 ); 
INSERT INTO TimeSeries VALUES ( 5, 2 , '2014-01-27 12:45' , 32.84 ); 
INSERT INTO TimeSeries VALUES ( 5, 2 , '2014-01-27 13:00' , 31.62 ); 
INSERT INTO TimeSeries VALUES ( 5, 3 , '2014-01-27 12:15' , -9.53 ); 
INSERT INTO TimeSeries VALUES ( 5, 3 , '2014-01-27 12:30' , -8.61 ); 
INSERT INTO TimeSeries VALUES ( 5, 3 , '2014-01-27 12:45' , -8.76 ); 
INSERT INTO TimeSeries VALUES ( 5, 3 , '2014-01-27 13:00' , -9.32 ); 
INSERT INTO TimeSeries VALUES ( 5, 4 , '2014-01-27 12:15' , 0.298 ); 
INSERT INTO TimeSeries VALUES ( 5, 4 , '2014-01-27 12:30' , 0.32  ); 
INSERT INTO TimeSeries VALUES ( 5, 4 , '2014-01-27 12:45' , 0.317 ); 
INSERT INTO TimeSeries VALUES ( 5, 4 , '2014-01-27 13:00' , 0.302 ); 

-- Just to make sure the query works 
INSERT INTO TimeSeries VALUES ( 5, 89 , '2014-01-27 18:30' , 10  ); 
INSERT INTO TimeSeries VALUES ( 5, 89 , '2014-01-27 19:00' , -273.2 ); -- this is not a contiguous value 

Query:

WITH Sequences AS (
    SELECT 
        T.*, 
        ROW_NUMBER() OVER (PARTITION BY SiteId, VariableId, Value ORDER BY DateTime) AS RNO, 
        ROW_NUMBER() OVER (ORDER BY SiteId, VariableId, DateTime) AS RNE 
    FROM 
        TimeSeries T 
) 
SELECT 
    S.SiteId, 
    S.VariableId, 
    S.Value, 
    MIN(S.DateTime) AS [Start], 
    MAX(S.DateTime) AS [End], 
    COUNT(*) AS ValueCount 
FROM 
    Sequences S 
GROUP BY 
    S.SiteId, 
    S.VariableId, 
    S.Value, 
    S.RNE - S.RNO 
HAVING 
    COUNT(*) > 1 

Results

| SITEID | VARIABLEID | VALUE |       START |       END | VALUECOUNT | 
|--------|------------|--------|--------------------------------|--------------------------------|------------| 
|  5 |   87 | -273.2 | January, 27 2014 12:15:00+0000 | January, 27 2014 13:00:00+0000 |   4 | 
|  5 |   88 | -273.2 | January, 27 2014 12:15:00+0000 | January, 27 2014 13:00:00+0000 |   4 | 
|  5 |   89 | -273.2 | January, 27 2014 12:15:00+0000 | January, 27 2014 13:00:00+0000 |   4 | 

As you can see, only 4 records are found for VariableId = 89 (because the last 2 records I added should not be counted). Based on this SO answer and this blog post.
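The question also asks for runs of more than 3 values within the last 24 hours, which the query above does not enforce. The following is only a sketch of how the same grouping might be adapted, under some assumptions that are not in the original: the `Recent` CTE and the `StartingDateTime` alias are names chosen here, `GETDATE()` is assumed to be in the same time zone as the stored `DateTime` values, and `RNE` is partitioned per series rather than numbered globally (the difference `RNE - RNO` still stays constant within a run either way).

```sql
-- Sketch only: same row-number grouping, restricted to the last 24 hours
-- and to runs of more than 3 identical values.
WITH Recent AS (
    -- Assumption: DateTime is stored in the server's local time, like GETDATE().
    SELECT SiteId, VariableId, [DateTime], Value
    FROM TimeSeries
    WHERE [DateTime] >= DATEADD(HOUR, -24, GETDATE())
),
Sequences AS (
    SELECT
        R.*,
        -- Position of the row among rows with the same value in its series
        ROW_NUMBER() OVER (PARTITION BY SiteId, VariableId, Value ORDER BY [DateTime]) AS RNO,
        -- Position of the row among all rows in its series
        ROW_NUMBER() OVER (PARTITION BY SiteId, VariableId ORDER BY [DateTime]) AS RNE
    FROM Recent R
)
SELECT
    S.SiteId,
    S.VariableId,
    MIN(S.[DateTime]) AS StartingDateTime,
    COUNT(*) AS ValueCount,
    S.Value
FROM Sequences S
GROUP BY
    S.SiteId,
    S.VariableId,
    S.Value,
    S.RNE - S.RNO      -- constant within one unbroken run of the same value
HAVING
    COUNT(*) > 3;
```

To see why the difference separates runs, take VariableId = 89 in the sample data with the per-series numbering used in this sketch: the four -273.2 rows from 12:15 to 13:00 get RNO 1 to 4 and RNE 1 to 4, so the difference is 0 for all of them. The 18:30 row (value 10) takes RNE 5, so the later -273.2 row at 19:00 gets RNO 5 and RNE 6, a difference of 1, and lands in its own group. Note that the sample rows are dated 2014, so the 24-hour filter returns nothing for them unless the cutoff is replaced with a fixed date, and a run that began before the cutoff is counted only from the cutoff onward.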


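This is perfect. Thank you! Hopefully others will be able to find it instead of spending time searching like I did. I know I will be coming back to refer to it. – user3229811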


Hey, glad I could help! :-) – rsenna