2016-12-14
2

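I'm trying to find a way to do this without using a cursor. Is there an alternative to a cursor for this?

I have a data set of about 13 million records. The gap between one record and the next varies, but is always between 5 and 20 minutes. I need to create a new table, selecting rows so that there is a gap of at least 30 minutes between one record and the next.

For example, if I have this: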

VID | Datetime 
1 | 2016-01-01 00:00 
1 | 2016-01-01 00:10 
1 | 2016-01-01 00:12 
1 | 2016-01-01 00:25 
2 | 2016-01-01 00:40 
4 | 2016-01-01 01:00 
4 | 2016-01-01 02:13 
6 | 2016-01-01 02:23 
7 | 2016-01-01 02:25 
8 | 2016-01-01 02:49 
9 | 2016-01-01 02:59 
9 | 2016-01-01 03:01 
9 | 2016-01-01 03:09 
9 | 2016-01-01 03:24 
9 | 2016-01-01 04:05 

The new table would look like this:

VID | Datetime 
1 | 2016-01-01 00:00 
2 | 2016-01-01 00:40 
4 | 2016-01-01 02:13 
8 | 2016-01-01 02:49 
9 | 2016-01-01 03:24 

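I can do this with a cursor, but for millions of records that is insane. I've seen something called a "quirky update" mentioned for similar situations, but I'm not sure what that is.

Currently using SQL Server 2014. Any help would be greatly appreciated.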
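
One reading of the rule above, sketched in Python rather than T-SQL so that the logic can be run anywhere: walk the rows in datetime order and keep a row only when it falls at least 30 minutes after the last row that was kept. The names `SAMPLE`, `load` and `keep_spaced` are illustrative and do not come from the question.

```python
from datetime import datetime, timedelta

# Sample rows from the question: (VID, time of day on 2016-01-01).
SAMPLE = [
    (1, "00:00"), (1, "00:10"), (1, "00:12"), (1, "00:25"), (2, "00:40"),
    (4, "01:00"), (4, "02:13"), (6, "02:23"), (7, "02:25"), (8, "02:49"),
    (9, "02:59"), (9, "03:01"), (9, "03:09"), (9, "03:24"), (9, "04:05"),
]

def load(rows):
    """Turn (vid, 'HH:MM') pairs into (vid, datetime) pairs."""
    return [(vid, datetime.strptime("2016-01-01 " + t, "%Y-%m-%d %H:%M"))
            for vid, t in rows]

def keep_spaced(rows, min_gap=timedelta(minutes=30)):
    """Keep a row only if it is at least min_gap after the last KEPT row."""
    kept, last = [], None
    for vid, ts in sorted(rows, key=lambda r: r[1]):
        if last is None or ts - last >= min_gap:
            kept.append((vid, ts))
            last = ts
    return kept

result = keep_spaced(load(SAMPLE))
for vid, ts in result:
    print(vid, "|", ts.strftime("%Y-%m-%d %H:%M"))
```

Under this reading the sample yields the five rows of the expected table and also `9 | 2016-01-01 04:05`, which falls 41 minutes after 03:24.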

+1

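The quirky update has a very specific list of requirements that must be met for it to work correctly. It is also undocumented behavior, so it may or may not always work for you. That said, Jeff Moden has a great article on the topic: http://www.sqlservercentral.com/articles/T-SQL/68467/ The downside is that I don't think it is what you want. It looks like you want the lowest datetime value in each 30-minute window. You could use ROW_NUMBER with a partition here. If nobody has posted an answer, I'll try to work on it after my meeting. –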

+0

@SeanLange: Thanks! – user3150002

+1

Times like this I really wish my workplace would get its act together and move off 2008. I sense a LEAD/LAG answer here, but I can't test it at work. –
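
One thing to watch with a LEAD/LAG approach: LAG compares each row with the row immediately before it, whereas the requirement compares each row with the last row that was kept. A small Python sketch of the previous-row comparison, using the question's sample times expressed as minutes after midnight (the names `MINUTES` and `keep_vs_previous_row` are illustrative):

```python
# Sample times from the question, as minutes after 2016-01-01 00:00.
MINUTES = [0, 10, 12, 25, 40, 60, 133, 143, 145, 169, 179, 181, 189, 204, 245]

def keep_vs_previous_row(minutes, min_gap=30):
    """LAG-style filter: compare each row with the row right before it."""
    kept = minutes[:1]  # the first row has no predecessor, so keep it
    for prev, cur in zip(minutes, minutes[1:]):
        if cur - prev >= min_gap:
            kept.append(cur)
    return kept

lag_result = keep_vs_previous_row(MINUTES)
print(lag_result)  # [0, 133, 245], i.e. 00:00, 02:13 and 04:05
```

The rows at 00:40, 02:49 and 03:24 from the expected table are missed, because each of them is less than 30 minutes after its immediate predecessor. So a LAG comparison by itself is probably not enough; some running state is needed.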

Answers

1

It would be a lot better if you had an identity column rather than [vid]. If that were the case, you could do this:

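-- Assumes id is a gapless sequence starting at 1, in datetime order. 
with mycte (id, mydate, keepthis, offset) 
as 
(
    -- anchor member: the first row is always kept 
    select 
     id, 
     mydate, 
     1 keepthis, 
     0 offset 
    from mytable where id = 1 
    union all 
    -- recursive member: offset carries the minutes elapsed since the last kept row 
    select 
     t.id, 
     t.mydate, 
     case when datediff(mi, o.mydate, t.mydate)+o.offset >= 30 then 1 else 0 end keepthis, 
     case when datediff(mi, o.mydate, t.mydate)+o.offset >= 30 then 0 else datediff(mi, o.mydate, t.mydate)+o.offset end 
    from mytable t join mycte o on t.id = o.id+1 
) 

select id,mydate from mycte where keepthis=1 
option (maxrecursion 0) -- one recursion level per row; the default limit is 100 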
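
To see what the two CASE expressions do, here is a minimal Python sketch that mirrors the recursive step row by row. It assumes the ids are consecutive and follow datetime order (for example, assigned with ROW_NUMBER()), and it uses the question's sample times as minutes after midnight; `MINUTES` and `cte_trace` are illustrative names.

```python
# Sample times from the question, as minutes after 2016-01-01 00:00.
MINUTES = [0, 10, 12, 25, 40, 60, 133, 143, 145, 169, 179, 181, 189, 204, 245]

def cte_trace(minutes, min_gap=30):
    """Return (id, time, keepthis, offset) tuples, one per row, like mycte."""
    trace = [(1, minutes[0], 1, 0)]  # anchor member: id = 1, kept, offset 0
    for i in range(1, len(minutes)):
        _, prev_time, _, prev_offset = trace[-1]
        # datediff(mi, o.mydate, t.mydate) + o.offset
        elapsed = (minutes[i] - prev_time) + prev_offset
        keepthis = 1 if elapsed >= min_gap else 0
        offset = 0 if keepthis else elapsed  # reset once a row is kept
        trace.append((i + 1, minutes[i], keepthis, offset))
    return trace

trace = cte_trace(MINUTES)
kept_times = [time for _, time, keepthis, _ in trace if keepthis == 1]
print(kept_times)  # [0, 40, 133, 169, 204, 245]
```

The offset column is what turns a row-to-row difference into "minutes since the last kept row": it accumulates across skipped rows and resets to 0 whenever a row is kept.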
+0

I can add an id using ROW_NUMBER(). The records are sorted in datetime order. I'll give this a try. – user3150002

+0

^ Unfortunately this doesn't seem to work – user3150002

+0

It doesn't work? –

1

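Thinking outside the box a little: if it is acceptable to simply take the earliest row in each 30-minute period, then I have a workable solution.

Caveats:

  • I'm assuming that your data might go back as far as 30 years. 13 million intervals is close to the limit of the tally table as configured here (t4 x t4 x t4 yields 16,777,216 rows), so if you go beyond 16 million you will need to modify it
  • You may need to break the CTEs out into temp tables and add indexes to get decent performance :-)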
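
The sizing in the first caveat can be checked with a little arithmetic. The sketch below (Python, illustrative variable names) reproduces the row counts of the tally generator used in the query that follows and the number of 30-minute intervals for that query's sample data, which runs from 1986-01-01 00:00 to 2016-01-01 04:05. Going by the query's join condition, the tally supplies one row per 30-minute interval between the earliest and latest timestamp, so the figure to compare against the limit appears to be the number of intervals in the date span rather than the number of records.

```python
from datetime import datetime

# Row counts of the stacked cross joins in the tally generator.
t1 = 2          # SELECT 1 UNION ALL SELECT 1
t2 = t1 * t1    # 4
t3 = t2 * t2    # 16
t4 = t3 * t3    # 256
tally_rows = t4 ** 3  # t4 x t4 x t4

# Span of the sample data in minutes, then in 30-minute intervals
# (integer division plus one, matching the query's join condition).
span = datetime(2016, 1, 1, 4, 5) - datetime(1986, 1, 1, 0, 0)
span_minutes = int(span.total_seconds() // 60)
intervals_needed = span_minutes // 30 + 1

print(tally_rows)        # 16777216
print(span_minutes)      # 15778325
print(intervals_needed)  # 525945
```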


--setup data 
declare @t table (VID int, [Datetime] datetime); 
insert @t values 
     (1, '1986-01-01 00:00'), --very early year 
     (1, '2016-01-01 00:10'), 
     (1, '2016-01-01 00:12'), 
     (1, '2016-01-01 00:25'), 
     (2, '2016-01-01 00:40'), 
     (4, '2016-01-01 01:00'), 
     (4, '2016-01-01 02:13'), 
     (6, '2016-01-01 02:23'), 
     (7, '2016-01-01 02:25'), 
     (8, '2016-01-01 02:49'), 
     (9, '2016-01-01 02:59'), 
     (9, '2016-01-01 03:01'), 
     (9, '2016-01-01 03:09'), 
     (9, '2016-01-01 03:24'), 
     (9, '2016-01-01 04:05'); 
select * from @t order by VID, [Datetime]; 
--select datediff(MI, (select min([Datetime]) from @t), (select max([Datetime]) from @t)); --15778325 records in 30 years - handled by t4 x t4 x t4 in tally generator 

-- Tally generator courtesy of http://www.sqlservercentral.com/blogs/never_say_never/2010/03/19/tally_2D00_table_2D00_cte/ 
-- Tally Table CTE script (SQL 2005+ only) 
-- You can use this to create many different numbers of rows... for example: 
-- You could use a 3 way cross join (t3 x, t3 y, t3 z) instead of just 2 way to generate a different number of rows. 
-- The # of rows this would generate for each is noted in the X3 comment column below. 
-- For most common usage, I find t3 or t4 to be enough, so that is what is coded here. 
-- If you use t3 in ‘Tally’, you can delete t4 and t5. 
; WITH 
    -- Tally table Gen   Tally Rows:  X2    X3 
t1 AS (SELECT 1 N UNION ALL SELECT 1 N), -- 4   , 8 
t2 AS (SELECT 1 N FROM t1 x, t1 y),   -- 16   , 64 
t3 AS (SELECT 1 N FROM t2 x, t2 y),   -- 256   , 4096 
t4 AS (SELECT 1 N FROM t3 x, t3 y),   -- 65536  , 16,777,216 
t5 AS (SELECT 1 N FROM t4 x, t4 y),   -- 4,294,967,296, A lot 
Tally AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) N 
      FROM t4 x, t4 y, t4 z), -- Change the t3's to one of the other numbers above for more/less rows 
--generate time values 
Intervals as (
     select t.N - 1 interval, 
       dateadd(mi, (t.N - 1) * 30, min_date.min_date) interval_start, 
       dateadd(mi, (t.N) * 30, min_date.min_date) next_interval_start 
     from (
       select min([Datetime]) min_date 
       from @t 
       ) min_date 
     join Tally t 
       on t.N <= datediff(MI, (select min([Datetime]) from @t), (select max([Datetime]) from @t))/30 + 1 
), 
--join intervals to data tables 
Intervaled_data as (
     select *, row_number() over (partition by i.interval order by t.[Datetime]) row_num 
     from @t t 
     join Intervals i 
       on t.[Datetime] >= i.interval_start and t.[Datetime] < i.next_interval_start 
) 
select i.VID, i.[Datetime] 
from Intervaled_data i 
where i.row_num = 1 
order by i.VID, i.[Datetime]; 
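
A minimal Python sketch of the same bucketing idea, to show how its result relates to the original requirement. It uses the question's sample times as minutes after midnight, with the buckets starting at the earliest timestamp as in the query above; `MINUTES` and `earliest_per_bucket` are illustrative names.

```python
# Sample times from the question, as minutes after 2016-01-01 00:00.
MINUTES = [0, 10, 12, 25, 40, 60, 133, 143, 145, 169, 179, 181, 189, 204, 245]

def earliest_per_bucket(minutes, width=30):
    """Keep the earliest time in each fixed-width bucket, counted from the minimum."""
    origin = min(minutes)
    first = {}
    for m in sorted(minutes):
        bucket = (m - origin) // width
        first.setdefault(bucket, m)  # only the first (earliest) value sticks
    return [first[b] for b in sorted(first)]

bucketed = earliest_per_bucket(MINUTES)
gaps = [b - a for a, b in zip(bucketed, bucketed[1:])]
print(bucketed)   # [0, 40, 60, 133, 169, 181, 245]
print(min(gaps))  # 12
```

Because the buckets are fixed, two kept rows can sit on either side of a bucket boundary and be closer than 30 minutes (here 02:49 and 03:01 are 12 minutes apart). This is the trade-off behind the "if it is acceptable" condition stated at the top of this answer.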
+0

Hmm.. interesting. Let me test it. – user3150002
