2010-10-18 77 views
3

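Data aggregation in SQL Server 2005

I need a query for SQL Server 2005 (SQL Server Management Studio Express). I have data stored at a 1-minute time frame (one row per minute); the columns of each table are ID, Symbol, DateTime, Open, High, Low, Close, Volume. I need to convert (compress) it to every possible multiple time frame, so let's say 10 minutes, 13, 15 and so on. I can provide details if anyone can help. Thank you, Alberto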

+0

Isn't this just a GROUP BY clause? – leppie 2010-10-18 12:32:40

+0

My apologies if this question is outside the rules of this group; if so, I will delete my request. – 2010-10-18 12:39:36

+2

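I don't really understand what you mean by compressing "to every possible multiple time frame, so let's say 10 minutes, 13, 15 and so on". Can you provide sample data and the desired results? – 2010-10-18 12:52:12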

Answers

1
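;WITH cte AS 
(SELECT *, 
     -- Seg is a numeric key shared by all rows in the same 15-minute bucket:
     -- day number, hour, and quarter of the hour (0, .25, .5, .75)
     (32 * CAST([DATETIME] AS INT)) + DATEPART(HOUR,[DATETIME]) + (DATEPART(MINUTE,[DATETIME])/15)/4.0 AS Seg 
    FROM  prices 
    ) 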
,cte1 AS 
(
SELECT *, 
     ROW_NUMBER() OVER (PARTITION BY Symbol,Seg ORDER BY [DATETIME])  AS RN_ASC , 
     ROW_NUMBER() OVER (PARTITION BY Symbol,Seg ORDER BY [DATETIME] DESC) AS RN_DESC 
FROM cte 
)  
SELECT 
     Symbol, 
     Seg, 
     MAX(CASE WHEN RN_ASC=1 THEN [DATETIME] END) AS OpenDateTime, 
     MAX(CASE WHEN RN_ASC=1 THEN [OPEN] END) AS [OPEN], 
     MAX(High) High, 
     MIN(Low) Low, 
     SUM(Volume) Volume, 
     MAX(CASE WHEN RN_DESC=1 THEN [CLOSE] END) AS [CLOSE], 
     MAX(CASE WHEN RN_DESC=1 THEN [DATETIME] END) AS CloseDateTime 
FROM cte1 
GROUP BY Symbol,Seg 
ORDER BY OpenDateTime 

Or here is another approach that might be worth testing to see whether it is any faster.

DECLARE @D1 DATETIME 
DECLARE @D2 DATETIME 
DECLARE @Interval FLOAT 

SET @D1 = '2010-10-18 09:00:00.000' 
SET @D2 = '2010-10-19 18:00:00.000' 
SET @Interval = 15 

;WITH 
-- L0..L4 double up by cross joins: 2, 4, 16, 256, 65536 rows
L0 AS (SELECT 1 AS c UNION ALL SELECT 1), 
L1 AS (SELECT 1 AS c FROM L0 A CROSS JOIN L0 B), 
L2 AS (SELECT 1 AS c FROM L1 A CROSS JOIN L1 B), 
L3 AS (SELECT 1 AS c FROM L2 A CROSS JOIN L2 B), 
L4 AS (SELECT 1 AS c FROM L3 A CROSS JOIN L3 B), 
-- Nums: the integers 1..65536
Nums AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS i FROM L4), 
-- Ranges: one [StartRange, NextRange) row per interval between @D1 and @D2
Ranges AS(
SELECT 
     DATEADD(MINUTE,@Interval*(i-1),@D1) AS StartRange, 
     DATEADD(MINUTE,@Interval*i,@D1) AS NextRange 
FROM Nums where i <= 1+CEILING(DATEDIFF(MINUTE,@D1,@D2)/@Interval)) 
,cte AS (
SELECT 
    * 
    ,ROW_NUMBER() OVER (PARTITION BY Symbol,r.StartRange ORDER BY [DateTime])  AS RN_ASC 
    ,ROW_NUMBER() OVER (PARTITION BY Symbol,r.StartRange ORDER BY [DateTime] DESC) AS RN_DESC 
FROM Ranges r 
JOIN prices p ON p.[DateTime] >= r.StartRange and p.[DateTime] < r.NextRange) 
SELECT 
     Symbol, 
     MAX(CASE WHEN RN_ASC=1 THEN [DateTime] END) AS OpenDateTime, 
     MAX(CASE WHEN RN_ASC=1 THEN [Open] END) AS [Open], 
     MAX(High) High, 
     MIN(Low) Low, 
     SUM(Volume) Volume, 
     MAX(CASE WHEN RN_DESC=1 THEN [Close] END) AS [Close], 
     MAX(CASE WHEN RN_DESC=1 THEN [DateTime] END) AS CloseDateTime 
FROM cte 
GROUP BY Symbol,StartRange 
ORDER BY OpenDateTime 
+0

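Thanks Martin, but I get errors. Maybe I can provide some sample xls data from my SQL database; I can imagine it is hard to code the query correctly without data. Can I attach a file here? Otherwise, acepsut is my Skype nickname as well as my gmail.com account. – 2010-10-18 14:55:55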

+0

What error do you get? (If you want to put the data somewhere, a Google spreadsheet might be a good place?) – 2010-10-18 14:58:23

+0

@Alberto - This has been updated following the comments under smirkingman's answer. – 2010-10-19 11:28:56

3

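Alberto, it looks like you need a "GROUP BY" clause in your SQL statement (as leppie mentioned), so it would be worth reading up on it.

First, filter the rows you want to aggregate by start/end date and time, then group them with the clause mentioned above.

Here is the first link I got when I searched Google for the keywords "sql group by".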

1

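It is not a simple GROUP BY: the open and close values need to come from the first row and the last row of the group, respectively. Or at least that is how it works for forex data :)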

+0

Yes, it is not a simple GROUP BY. – 2010-10-18 13:28:42

0

It would be prettier with a stored procedure that first extracts MIN(datetime), but here is a sketch:

;WITH quarters(q) AS (
    -- distinct 15-minute bucket starts, as minutes since 2000-01-01, rounded down
    SELECT DISTINCT 
     15*CAST(DATEDIFF(minute,'2000/01/01',dataora)/15 as Int) AS primo 
    FROM 
     Prezzi 
) 
SELECT 
    Z.simbolo, DATEADD(minute,Q.q,'2000/01/01') AS tick, 
     MIN(Z.minimo) AS minimo, MAX(Z.massimo) AS massimo, 
     -- first open at or after the bucket start, for the same symbol
     (SELECT 
      TOP 1 P.apertura FROM Prezzi P 
     WHERE 
      P.simbolo = Z.simbolo AND 
      P.dataora >= DATEADD(minute,Q.q,'2000/01/01') 
     ORDER BY 
      P.dataora ASC 
     ) as primaapertura, 
     -- last close at or before the bucket end (start + 14:59), for the same symbol
     (SELECT 
      TOP 1 P.chiusura FROM Prezzi P 
     WHERE 
      P.simbolo = Z.simbolo AND 
      P.dataora <= DATEADD(second,14*60+59,DATEADD(minute,Q.q,'2000/01/01')) 
     ORDER BY 
      P.dataora DESC 
     ) as ultimachiusara, 
     SUM(Z.volume)/COUNT(*) AS volumemedio 
FROM 
    quarters Q INNER JOIN Prezzi Z 
    ON Z.dataora BETWEEN DATEADD(minute,Q.q,'2000/01/01') 
     AND DATEADD(second,14*60+59,DATEADD(minute,Q.q,'2000/01/01')) 
GROUP BY 
    -- group on q itself so the correlated subqueries may reference it
    Z.simbolo, Q.q 
ORDER BY 
    1, 2 

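The WITH clause gets the list of 15-minute intervals, rounded down, that are present in the data set (let's assume there is nothing before 2000). Those intervals are then used to group the rows into intervals 14:59 long. For the volume, you have to decide whether you want the average or the total.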

The syntax may not be quite right, but you should get the idea.
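The rounding-down step and the average-versus-total choice for volume can be sketched on their own. This is only an illustration: it reuses the `Prezzi`, `simbolo`, `dataora` and `volume` names from the query above, while `@Interval`, `volumetotale` and the derived-table alias `b` are hypothetical names introduced here.

```sql
-- Sketch only: @Interval, volumetotale and the alias b are hypothetical.
DECLARE @Interval INT
SET @Interval = 13   -- bucket width in minutes (10, 13, 15, ...)

SELECT
    simbolo,
    tick,
    SUM(volume)                  AS volumetotale,  -- total over the bucket
    SUM(volume) * 1.0 / COUNT(*) AS volumemedio    -- average per 1-minute row
FROM (
    SELECT simbolo, volume,
           -- integer division rounds down to the start of the bucket
           DATEADD(MINUTE,
                   (DATEDIFF(MINUTE, '20000101', dataora) / @Interval) * @Interval,
                   '20000101') AS tick
    FROM Prezzi
) AS b
GROUP BY simbolo, tick
ORDER BY simbolo, tick
```

Buckets here are counted from midnight on 2000-01-01, so widths that do not divide a day evenly (13 minutes, for example) will not line up with the start of each trading session; anchoring on the session open instead would be a design choice for the OP to make.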

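EDIT: changed MIN(open), MIN(close) to take the first and the last. In practice this does not change much, because the notion of open and close depends on knowing the time difference between the exchange where the quote was generated and the moment the computer collected the data.

Also, unless the OP has the privilege of real-time information from all exchanges, all quotes will be delayed by 20 minutes.

EDIT(2): Quite right, FIRST and LAST are from my IBM days > ;-)

The solution now uses TOP with ASC/DESC to select the first and last quotes within the interval.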

+0

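Why are you getting 'MIN(open)' and 'MIN(close)'? The OP needs the "open" price associated with the first record of each segment and the "close" price associated with the last record of each segment. – 2010-10-18 15:29:37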

+0

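Laziness. His data comes from a price feed. Open and close cannot change during the day; they are the stock's price first thing this morning and last thing last night. – smirkingman 2010-10-18 15:40:53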

+0

That does make sense, but it is not how the OP defined it in the comments: "Close (last price within this time frame)". @Alberto - can you clarify? – 2010-10-18 15:51:07

0
Declare @tbl1MinENI Table 
    (ID int identity, 
    Simbolo char(3), 
    DataOra datetime, 
    Apertura numeric(15,4), 
    Massimo numeric(15,4), 
    Minimo numeric(15,4), 
    Chiusura numeric(15,4), 
    Volume int) 

    Insert Into @tbl1MinENI ( Simbolo, DataOra, Apertura, Massimo, Minimo, Chiusura, Volume) 
    Values 
    ('ENI', '2010/10/18 09:00:00', 16.1100, 16.1800, 16.1100, 16.1400, 244015), 
    ('ENI', '2010/10/18 09:01:00', 16.1400, 16.1400, 16.1300, 16.1400, 15692), 
    ('ENI', '2010/10/18 09:02:00', 16.1400, 16.1500, 16.1400, 16.1500, 147035), 
    ('ENI', '2010/10/18 09:03:00', 16.1500, 16.1600, 16.1500, 16.1600, 5181 ), 
    ('ENI', '2010/10/18 09:04:00', 16.1600, 16.2000, 16.1600, 16.1900, 5134 ), 
    ('ENI', '2010/10/18 09:05:00', 16.1900, 16.1900, 16.1800, 16.1800, 15040), 
    ('ENI', '2010/10/18 09:06:00', 16.1900, 16.1900, 16.1600, 16.1600, 68867), 
    ('ENI', '2010/10/18 09:07:00', 16.1600, 16.1600, 16.1600, 16.1600, 7606 ), 
    ('ENI', '2010/10/18 09:08:00', 16.1500, 16.1500, 16.1500, 16.1500, 725 ), 
    ('ENI', '2010/10/18 09:09:00', 16.1600, 16.1600, 16.1600, 16.1600, 81 ), 
    ('ENI', '2010/10/18 09:10:00', 16.1700, 16.1800, 16.1700, 16.1700, 68594), 
    ('ENI', '2010/10/18 09:11:00', 16.1800, 16.1800, 16.1800, 16.1800, 6619 ) 

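    -- Note: DECLARE with an initial value requires SQL Server 2008 or later;
    -- on 2005, declare the variable first and assign it with SET.
    Declare @nRowsPerGroup int = 3 

-- Prepare: number each row's group by counting minutes from the first bar
-- and dividing by the group size
;With Prepare as 
(
Select datediff(minute, '2010/10/18 09:00:00', DataOra)/@nRowsPerGroup as Grp, 
     Row_Number() over (partition by datediff(minute, '2010/10/18 09:00:00', DataOra)/@nRowsPerGroup order by dataora) as rn, 
     * 
    From @tbl1MinENI  
), b as 
(
-- Per group: high, low, total volume, and the IDs of the first and last rows
-- (assumes ID increases with DataOra)
Select a.Grp, 
     Min(a.DataOra)   as GroupDataOra, 
     Min(ID) AperturaID, 
     max(a.Massimo)   as Massimo, 
     Min(a.Minimo)   as Minimo, 
     max(id) ChiusuraID, 
     sum(a.Volume)   as Volume 
    From Prepare a 
    Group by Grp 
) 
-- Join back to the table variable to pick the open of the first row
-- and the close of the last row
Select b.grp, 
     b.GroupDataOra, 
     ta.Apertura, 
     b.Massimo, 
     b.Minimo, 
     tc.Chiusura, 
     b.Volume 
From b 
Inner Join @tbl1MinENI ta on ta.ID=b.AperturaID 
Inner Join @tbl1MinENI tc on tc.ID=b.ChiusuraID 
; 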
+0

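Thanks Nikola, I get some errors: 1) Msg 102, Level 15, Line 13: incorrect syntax near ','. 2) Msg 139, Level 15: cannot assign a default value to a local variable. 3) Msg 137, Level 15, Line 30: must declare the scalar variable "@nRowsPerGroup". – 2010-10-19 12:39:41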

+0

Try this with the 09:00:00 record missing (see the earlier comments) – smirkingman 2010-10-19 13:33:58

+0

It works. Just remove the insert for 09:00:00 and try it. In fact, you can delete any number of rows you like and it will still work. – Niikola 2010-10-26 16:45:17
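The three errors Alberto reported are consistent with syntax that was only introduced in SQL Server 2008: a multi-row VALUES list and a DECLARE with an initial value. A sketch of a setup that should also parse on SQL Server 2005 might look like the following. Only the first three sample rows are shown, and the final `LoadedRows` check is a hypothetical addition for illustration.

```sql
-- SQL Server 2005-compatible setup (sketch; only three sample rows shown)
DECLARE @tbl1MinENI TABLE
    (ID int IDENTITY,
     Simbolo char(3),
     DataOra datetime,
     Apertura numeric(15,4),
     Massimo numeric(15,4),
     Minimo numeric(15,4),
     Chiusura numeric(15,4),
     Volume int)

-- One INSERT per row instead of a multi-row VALUES list;
-- the 'yyyymmdd hh:mm:ss' literal does not depend on the session date format
INSERT INTO @tbl1MinENI (Simbolo, DataOra, Apertura, Massimo, Minimo, Chiusura, Volume)
VALUES ('ENI', '20101018 09:00:00', 16.1100, 16.1800, 16.1100, 16.1400, 244015)
INSERT INTO @tbl1MinENI (Simbolo, DataOra, Apertura, Massimo, Minimo, Chiusura, Volume)
VALUES ('ENI', '20101018 09:01:00', 16.1400, 16.1400, 16.1300, 16.1400, 15692)
INSERT INTO @tbl1MinENI (Simbolo, DataOra, Apertura, Massimo, Minimo, Chiusura, Volume)
VALUES ('ENI', '20101018 09:02:00', 16.1400, 16.1500, 16.1400, 16.1500, 147035)

-- Declare first, assign afterwards
DECLARE @nRowsPerGroup int
SET @nRowsPerGroup = 3

-- Hypothetical sanity check that the rows were loaded
SELECT COUNT(*) AS LoadedRows FROM @tbl1MinENI
```

Because a table variable only lives for the duration of its batch, the grouping query from the answer has to run in the same batch as this setup.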