2011-10-26 48 views
1

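SQL Server: find the datediff between different rows and sum it

I am trying to build a query to analyze data from our time-tracking system. Each time a user swipes in or out, a row records the swipe time and whether the user went on or off site (entering or leaving). For the user 'Joe Bloggs' there are 4 rows, which I want to pair up in order to calculate the total time Joe Bloggs spent on site.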

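The problem is that some records do not pair up cleanly. In the example given, the second user has two consecutive 'on' rows, so I need a way to ignore repeated 'on' or 'off' rows.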

ID  | Time                    | OnOffSite | UserName   |
--------------------------------------------------------
123 | 2011-10-25 09:00:00.000 | on        | Bloggs Joe |
124 | 2011-10-25 12:00:00.000 | off       | Bloggs Joe |
125 | 2011-10-25 13:00:00.000 | on        | Bloggs Joe |
126 | 2011-10-25 17:00:00.000 | off       | Bloggs Joe |
127 | 2011-10-25 09:00:00.000 | on        | Jonesy Ian |
128 | 2011-10-25 10:00:00.000 | on        | Jonesy Ian |
129 | 2011-10-25 11:00:00.000 | off       | Jonesy Ian |
130 | 2011-10-25 12:00:00.000 | on        | Jonesy Ian |
131 | 2011-10-25 15:00:00.000 | off       | Jonesy Ian |

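My system is MS SQL Server 2005. The query will be run for a monthly reporting period.

Can anyone suggest a solution? My data is already in a single table, ordered by user name and time, and the ID column is an identity column.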

+2

For Jonesy Ian, which 'on' do you want to discard? –

+0

By 'each time a user swipes in', do you mean 'each time a user authenticates'? – npclaudiu

+0

I want to discard the second 'on', and yes, by swiping I mean authenticating. Thanks for the answers so far :) I will try to test them today. – MarcKirby

Answers

3
-- ===================== 
-- sample data 
-- ===================== 
declare @t table 
(
    ID int, 
    Time datetime, 
    OnOffSite varchar(3), 
    UserName varchar(50) 
) 

insert into @t values(123, '2011-10-25 09:00:00.000', 'on', 'Bloggs Joe') 
insert into @t values(124, '2011-10-25 12:00:00.000', 'off', 'Bloggs Joe') 
insert into @t values(125, '2011-10-25 13:00:00.000', 'on', 'Bloggs Joe') 
insert into @t values(126, '2011-10-25 17:00:00.000', 'off', 'Bloggs Joe') 
insert into @t values(127, '2011-10-25 09:00:00.000', 'on', 'Jonesy Ian') 
insert into @t values(128, '2011-10-25 10:00:00.000', 'on', 'Jonesy Ian') 
insert into @t values(129, '2011-10-25 11:00:00.000', 'off', 'Jonesy Ian') 
insert into @t values(130, '2011-10-25 12:00:00.000', 'on', 'Jonesy Ian') 
insert into @t values(131, '2011-10-25 15:00:00.000', 'off', 'Jonesy Ian') 

-- ===================== 
-- solution 
-- ===================== 
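select 
    -- note: DATEDIFF(hh, ...) counts the hour boundaries crossed, not elapsed hours; 
    -- use DATEDIFF(mi, ...)/60.0 if swipes do not fall exactly on the hour 
    UserName, timeon, timeoff, diffinhours = DATEDIFF(hh, timeon, timeoff) 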
from 
(
    select 
     UserName, 
     timeon = max(case when k = 2 and OnOffSite = 'on' then Time end), 
     timeoff = max(case when k = 1 and OnOffSite = 'off' then Time end) 
    from 
    (
     select 
      ID, 
      UserName, 
      OnOffSite, 
      Time, 
      rn = ROW_NUMBER() over(partition by username order by id) 
     from 
     (
      select 
       ID, 
       UserName, 
       OnOffSite, 
       Time, 
       rn2 = case OnOffSite 
       -- '(..order by id)' takes earliest 'on' in the sequence of 'on's 
       -- to take the latest use '(...order by id desc)' 
       when 'on' then 
        ROW_NUMBER() over(partition by UserName, OnOffSite, rn1 order by id) 
       -- '(... order by id desc)' takes the latest 'off' in the sequence of 'off's 
       -- to take the earliest use '(...order by id)' 
       when 'off' then 
        ROW_NUMBER() over(partition by UserName, OnOffSite, rn1 order by id desc) 
       end, 
       rn1 
      from 
      (
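       select 
        *, 
        -- rn1 is constant within each run of consecutive rows of the same type: 
        -- position in the user's sequence + reverse position within that type 
        rn1 = ROW_NUMBER() over(partition by username order by id) + 
         ROW_NUMBER() over(partition by username, onoffsite order by id desc) 
       from @t 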
      ) t 
     ) t 
     where rn2 = 1 
    ) t1 
    cross join 
    (
     select k = 1 union select k = 2 
    ) t2 
    group by UserName, rn + k 
) t 
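-- 'or' also keeps unpaired swipes (diffinhours is NULL for them); 
-- use 'and' to return complete on/off pairs only 
where timeon is not null or timeoff is not null 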
order by username 
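
The query above returns one row per on/off pair, while the question asks for the total time per user. A minimal sketch of that last step follows. It assumes the paired rows have been stored in a table variable; the name @pairs is hypothetical and is filled by hand here with the four pairs that the sample data produces. The October 2011 bounds are likewise only an assumed stand-in for the monthly reporting period. Minutes are summed instead of hours because DATEDIFF counts boundaries crossed, so hour-level differences can be off when swipes do not fall exactly on the hour.

```sql
-- Hypothetical holding table for the pairs returned by the query above
declare @pairs table
(
    UserName varchar(50),
    timeon datetime,
    timeoff datetime
)

insert into @pairs values('Bloggs Joe', '2011-10-25T09:00:00', '2011-10-25T12:00:00')
insert into @pairs values('Bloggs Joe', '2011-10-25T13:00:00', '2011-10-25T17:00:00')
insert into @pairs values('Jonesy Ian', '2011-10-25T09:00:00', '2011-10-25T11:00:00')
insert into @pairs values('Jonesy Ian', '2011-10-25T12:00:00', '2011-10-25T15:00:00')

select
    UserName,
    total_minutes = sum(datediff(mi, timeon, timeoff)),
    total_hours = sum(datediff(mi, timeon, timeoff)) / 60.0
from @pairs
where timeon is not null and timeoff is not null    -- complete pairs only
  and timeon >= '20111001' and timeon < '20111101'  -- assumed reporting month
group by UserName
order by UserName
-- for the sample data: Bloggs Joe = 420 minutes, Jonesy Ian = 300 minutes
```

A pair that starts in one month and ends in the next is attributed to the month in which it starts; whether that is the right rule depends on how the report is meant to treat overnight or cross-month visits.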
+0

This answer is correct and works very well with my data. There is a T-SQL master, and his name is Alexey! Thank you very much. – MarcKirby

+0

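+1. My solution was almost the same. I used ranking throughout from the start, and I can see that at some point you also changed the intermediate grouping to ranking. The only difference is how I obtained 'timeon' and 'timeoff': I used a self-join, which I think is worse in this case than the 'max(case ...)' used in your answer. Anyway, this works well, so... well done! :) –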

0

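First, you need to talk to the business side and agree on a set of matching rules.

After that, I suggest adding a status column to the table that records the state of each row (matched, unmatched, deleted, and so on). Whenever a row is added, try to match it into a pair. A successful match sets the status of both rows to matched; otherwise the new row is left unmatched.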
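
A rough sketch of how matching at insert time might look, under stated assumptions. The table dbo.Swipes, the procedure dbo.AddSwipe, the MatchedID column, and the status values are all hypothetical names chosen for illustration. The rule used here (a repeated 'on' is ignored, and an 'off' pairs with the user's open 'on') is only one possible rule and would need to be replaced by whatever the business side agrees on.

```sql
-- Hypothetical table; names and status values are assumptions
create table dbo.Swipes
(
    ID        int identity(1, 1) primary key,
    Time      datetime    not null,
    OnOffSite varchar(3)  not null,
    UserName  varchar(50) not null,
    Status    varchar(10) not null default 'unmatched', -- unmatched / matched / ignored
    MatchedID int         null                           -- ID of the paired row, if any
)
go

create procedure dbo.AddSwipe
    @Time datetime,
    @OnOffSite varchar(3),
    @UserName varchar(50)
as
begin
    set nocount on

    declare @NewID int, @OpenOnID int

    begin transaction

    -- Look for the user's open (still unmatched) 'on' row, if any
    select @OpenOnID = max(ID)
    from dbo.Swipes with (updlock, holdlock)
    where UserName = @UserName and OnOffSite = 'on' and Status = 'unmatched'

    insert into dbo.Swipes (Time, OnOffSite, UserName)
    values (@Time, @OnOffSite, @UserName)

    set @NewID = scope_identity()

    if @OnOffSite = 'on' and @OpenOnID is not null
        -- Assumed rule: a repeated 'on' is ignored, the earlier 'on' stays open
        update dbo.Swipes set Status = 'ignored' where ID = @NewID
    else if @OnOffSite = 'off' and @OpenOnID is not null
        -- Pair the new 'off' with the open 'on' and mark both rows as matched
        update dbo.Swipes
        set Status = 'matched',
            MatchedID = case ID when @NewID then @OpenOnID else @NewID end
        where ID in (@NewID, @OpenOnID)
    -- Otherwise (a first 'on', or an 'off' with no open 'on') the row stays 'unmatched'

    commit transaction
end
go

-- Usage: the second 'on' is ignored, the 'off' pairs with the 09:00 'on'
exec dbo.AddSwipe '2011-10-25T09:00:00', 'on', 'Jonesy Ian'
exec dbo.AddSwipe '2011-10-25T10:00:00', 'on', 'Jonesy Ian'
exec dbo.AddSwipe '2011-10-25T11:00:00', 'off', 'Jonesy Ian'

-- Time on site per user, computed from the matched pairs
select s_on.UserName, total_minutes = sum(datediff(mi, s_on.Time, s_off.Time))
from dbo.Swipes s_on
join dbo.Swipes s_off on s_off.ID = s_on.MatchedID
where s_on.OnOffSite = 'on' and s_on.Status = 'matched'
group by s_on.UserName
```

With the pairs recorded as rows arrive, the monthly report reduces to a plain join and sum, at the cost of extra work on every insert and of having to re-match existing rows if the rules change later.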