2014-08-28 74 views

SQL: aggregating sizes into bins

I have a (Sybase) table with the following information:

order_id int 
timestamp datetime 
action  char(1)  --i=inserted, c=corrected, r=removed 
shares  int 

It tracks the shares associated with an order (identified by its order_id) in a system. As an example, the lifetime of one order looks like this:

timestamp action shares  
    10:00:00 i  1000  -- initial Insert  
    10:06:30 c  900  -- one Change  
    10:07:12 c  800  
    10:50:20 r  800  -- Removal  
    11:10:10 i  600  -- 2nd Insert  
    11:12:10 r  600 

In the example above, the order is active from 10:00:00 to 10:50:20, and again from 11:10:10 to 11:12:10.

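I have 1000 such orders in the system, and I need to plot a histogram showing how many shares are active over time, binned into 5-minute buckets. If the number of shares for a given order changes more than once within the same bin, I need to average those values; e.g. in the example above, for the 10:05-10:10 bin, 1000, 900 and 800 would be averaged to 900.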

Here is a more complex example:

1, "20140828 10:00:00", "i", 1000 
1, "20140828 10:06:00", "c", 900 
1, "20140828 10:07:12", "c", 500 
1, "20140828 10:10:10", "c", 400 
1, "20140828 10:20:20", "r", 400 
1, "20140828 10:30:10", "i", 300 
1, "20140828 10:32:10", "r", 300 

2, "20140828 09:51:00", "i", 500 
2, "20140828 10:08:30", "r", 500 

3, "20140828 10:10:00", "i", 1000 
3, "20140828 10:11:20", "r", 1000 

With its expected output:

10:00:00 1500 
10:05:00 1300 
10:10:00 1450 
10:15:00 400 
10:20:00 400 
10:25:00 0 
10:30:00 300 
10:35:00 0 
10:40:00 0 
10:45:00 0 
10:50:00 0 
10:55:00 0 
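
The averaging rule implied by this expected output can be sketched outside SQL. The following is a minimal Python sketch, not a Sybase solution: it assumes that, for each order and bin, the values averaged are the share count carried in at the start of the bin (if the order is active) plus every share count set by an `i` or `c` row inside the bin, and that an `r` row only ends the order's activity. The names `SAMPLE`, `bin_start` and `active_shares_by_bin` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

BIN = timedelta(minutes=5)

def _row(order_id, ts, action, shares):
    # Parse the "YYYYMMDD HH:MM:SS" timestamps used in the sample data.
    return (order_id, datetime.strptime(ts, "%Y%m%d %H:%M:%S"), action, shares)

SAMPLE = [
    _row(1, "20140828 10:00:00", "i", 1000),
    _row(1, "20140828 10:06:00", "c", 900),
    _row(1, "20140828 10:07:12", "c", 500),
    _row(1, "20140828 10:10:10", "c", 400),
    _row(1, "20140828 10:20:20", "r", 400),
    _row(1, "20140828 10:30:10", "i", 300),
    _row(1, "20140828 10:32:10", "r", 300),
    _row(2, "20140828 09:51:00", "i", 500),
    _row(2, "20140828 10:08:30", "r", 500),
    _row(3, "20140828 10:10:00", "i", 1000),
    _row(3, "20140828 10:11:20", "r", 1000),
]

def bin_start(ts):
    """Round a datetime down to the start of its 5-minute bin."""
    return ts.replace(minute=ts.minute - ts.minute % 5, second=0, microsecond=0)

def active_shares_by_bin(events, start, end):
    """Return [(HH:MM:SS, total active shares)] for each bin in [start, end)."""
    by_order = defaultdict(list)
    for order_id, ts, action, shares in events:
        by_order[order_id].append((ts, action, shares))

    totals = defaultdict(float)
    for rows in by_order.values():
        rows.sort(key=lambda r: r[0])
        current = None          # shares currently active; None while inactive
        i = 0
        t = bin_start(rows[0][0])
        while t < end:
            # Start with the value carried into this bin, if any.
            seen = [] if current is None else [current]
            while i < len(rows) and rows[i][0] < t + BIN:
                _, action, shares = rows[i]
                if action == "r":
                    current = None
                else:           # 'i' or 'c' sets a new share count
                    current = shares
                    seen.append(shares)
                i += 1
            if seen:
                totals[t] += sum(seen) / len(seen)
            t += BIN

    result = []
    t = start
    while t < end:
        result.append((t.strftime("%H:%M:%S"), totals.get(t, 0.0)))
        t += BIN
    return result

if __name__ == "__main__":
    for label, total in active_shares_by_bin(
            SAMPLE, datetime(2014, 8, 28, 10, 0), datetime(2014, 8, 28, 11, 0)):
        print(label, total)
```

Under these assumptions the sketch gives 1500, 1300, 1450, 400, 400, 0, 300 and then zeros for the bins from 10:00:00 to 10:55:00, which matches the table above. Note that the 09:51:00 insert for order 2 has to be processed even though it falls before the first reported bin, since it determines the value carried into 10:00:00.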

Thanks in advance for your help.

Could you post the expected output for your sample input data? – 2014-08-28 03:39:27

Thanks for the suggestion, @JaugarChang. I have edited my post and added a more comprehensive example with the expected output. – 2014-08-28 19:40:19

Answer

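This is a variation on the running sum problem in SQL Server (whether MS or Sybase, given their shared history), grouped by a bucket ID, which can be obtained simply by integer-dividing the time difference in minutes from a base time by 5. So something like this will do: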

create table #t(
    BucketNo int not null primary key clustered,
    Activity int not null,   -- net change in active shares within the bucket
    Active   int not null    -- running total, filled in by a later update
);

-- assumed base time: start of the day being analysed; buckets count from here
declare @BaseTime datetime = '20140828';

-- pre-aggregate activity data
-- assumes prior existence of a zero-based NUMBERS or TALLY table with column N
insert #t(BucketNo,Activity,Active)
select
    N
    ,isnull(data.Activity,0)
    ,0
from NUMBERS
left join (
    select BucketNo, sum(Delta) as Activity
    from (
        -- inserts add shares, removals subtract them
        select
            datediff(mi,@BaseTime,TimeStamp)/5 as BucketNo
            ,case action when 'i' then +1
                         when 'r' then -1
             end * shares as Delta
        from ActivityTable
        where action <> 'c'

        union all

        -- a correction contributes its new share count minus the previous row's
        select
            datediff(mi,@BaseTime,c.TimeStamp)/5 as BucketNo
            ,c.shares
             - ( select top 1 i.shares
                 from ActivityTable i
                 where i.order_id = c.order_id and i.TimeStamp < c.TimeStamp
                 order by i.TimeStamp desc
               ) as Delta
        from ActivityTable as c
        where c.action = 'c'
    ) deltas
    group by BucketNo
) data on data.BucketNo = N
where N < 24 * 12; -- 5 minute buckets per day

Now we use the SQL Server quirky update, in clustered index order, to perform the running sum processing on #t.

declare @Shares int = 0,
     @BucketNo int = 0;

-- `quirky update` peculiar to SQL Server; it relies on rows being updated
-- in clustered index (BucketNo) order, which is not documented behaviour
update #t
    set @Shares = Active = @Shares + Activity, -- carry the running total forward
     @BucketNo = BucketNo                      -- anchor on the clustered key
from #t with (TABLOCKX) -- not strictly necessary when using a temp table.
option (MAXDOP 1);  -- prevent parallelization of query

select BucketNo, Active from #t order by BucketNo
go
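
To make the two ideas in this answer concrete (bucket number by integer division, then a running sum over buckets), here is a minimal Python sketch. It is an illustration only, not a translation of the SQL: `bucket_no` and `running_active` are hypothetical names, and the base time is an assumption chosen by the caller. Note that a plain running sum gives the shares active at the end of each bucket; it does not by itself perform the within-bucket averaging described in the question.

```python
from datetime import datetime
from itertools import accumulate

def bucket_no(ts, base, minutes=5):
    """Whole minutes elapsed since `base`, integer-divided by the bucket width."""
    return int((ts - base).total_seconds() // 60) // minutes

def running_active(deltas, n_buckets):
    """deltas: iterable of (bucket number, signed change in shares).

    Returns the running total of active shares at the end of each bucket.
    Changes that fall outside 0..n_buckets-1 are ignored in this sketch.
    """
    activity = [0] * n_buckets
    for b, d in deltas:
        if 0 <= b < n_buckets:
            activity[b] += d
    return list(accumulate(activity))

if __name__ == "__main__":
    base = datetime(2014, 8, 28, 9, 50)   # assumed base time
    # Order 2 from the question: +500 at 09:51:00, -500 at 10:08:30.
    deltas = [
        (bucket_no(datetime(2014, 8, 28, 9, 51, 0), base), +500),
        (bucket_no(datetime(2014, 8, 28, 10, 8, 30), base), -500),
    ]
    print(running_active(deltas, 5))
```

An order inserted before the base time would be dropped by the range check, so the base should be no later than the earliest insert of interest.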

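Thanks Pieter, I'm trying your solution (with some modifications for Sybase), but I have 3 questions: 1) what is the NUMBERS/TALLY table you mention, 2) I get a syntax error around the line 'select top 1 i.shares': 'Incorrect syntax near the keyword 'top'. Msg 102, Level 15, State 181', 3) will the quirky update work in Sybase? – 2014-08-28 15:12:31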

@jeromeso: **Numbers table**: http://dba.stackexchange.com/questions/11506/why-are-numbers-tables-invaluable and http://sqlblog.com/blogs/adam_machanic/archive/2006/07/12/you-require-a-numbers-table.aspx. Yes, the quirky update should still work in SYBASE, since it predates the split between the SYBASE and MS versions of SQL Server. Regarding the syntax error, try **FIRST** instead of **TOP 1** (per http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc00801.1510/html/iqrefso/X315771.htm) – 2014-08-28 20:03:29

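After playing with the SQL solution, I realized it was slower than my Perl implementation, so I'm sticking with Perl. Thanks for the suggestions, @PieterGeedkens! I learned a lot. – 2014-08-29 18:06:10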