2010-10-28 24 views
1

Max with duplicates

I have a table like this:

DateTime start_time not null, 
DateTime end_time not null, 
Status_Id int not null, 
Entry_Id int not null 

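For a given time period, I want the count per status, where for each entry_id only the row with the latest start counts as valid.

This is what I am using now (the dates are dynamic):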

with c (Status_Id, Entry_Id, Start_Date) AS (
    select Status_Id, Entry_Id, Start_Date from tbl where 
    (End_Date BETWEEN '19000101' AND '21000101') 
    AND ((Start_Date BETWEEN '19000101' AND '21000101') 
    OR End_Date <= '21000101')) 
select Status_Id, count(*) as cnt from 
(select Entry_Id, max(start_date) as start_date from c 
    group by Entry_Id) d inner join 
c on c.Entry_Id = d.Entry_Id 
and c.start_date = d.start_date 
GROUP BY Status_Id WITH ROLLUP 

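The problem is that it counts incorrectly when an entry_id has multiple rows with the same start date. (I don't particularly care which status is chosen in that case, as long as only one is chosen.)

Some test data: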

status_id Entry_id Start_date 
496 45173 2010-09-29 18:04:33.000 
490 45173 2010-09-29 18:48:20.100 
495 45173 2010-09-29 19:25:29.300 
489 45174 2010-09-29 18:43:01.500 
493 45175 2010-09-29 18:48:00.500 
493 45175 2010-09-29 21:16:02.700 
489 45175 2010-09-30 17:52:12.100 
493 45176 2010-09-29 17:55:21.300 
492 45176 2010-09-29 18:20:52.200 <------ This is the one that gives the problems 
493 45176 2010-09-29 18:20:52.200 <------ This is the one that gives the problems 

The result should be

495 1 
489 2 
492 1 (or 493 1) 
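
The double counting can be reproduced on the test data above. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for SQL Server: the date filter and WITH ROLLUP are left out (SQLite has no ROLLUP), and timestamps are stored as text, which sorts correctly in this format.

```python
import sqlite3

# Test data from the question: (Status_Id, Entry_Id, Start_Date).
rows = [
    (496, 45173, "2010-09-29 18:04:33.000"),
    (490, 45173, "2010-09-29 18:48:20.100"),
    (495, 45173, "2010-09-29 19:25:29.300"),
    (489, 45174, "2010-09-29 18:43:01.500"),
    (493, 45175, "2010-09-29 18:48:00.500"),
    (493, 45175, "2010-09-29 21:16:02.700"),
    (489, 45175, "2010-09-30 17:52:12.100"),
    (493, 45176, "2010-09-29 17:55:21.300"),
    (492, 45176, "2010-09-29 18:20:52.200"),  # tied for latest
    (493, 45176, "2010-09-29 18:20:52.200"),  # tied for latest
]

# Assumption: an in-memory SQLite table stands in for the real SQL Server table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (Status_Id INT, Entry_Id INT, Start_Date TEXT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)

# Join each entry to its latest start date; tied rows both survive the join.
naive = conn.execute("""
    SELECT c.Status_Id, COUNT(*) AS cnt
    FROM (SELECT Entry_Id, MAX(Start_Date) AS Start_Date
          FROM tbl GROUP BY Entry_Id) d
    JOIN tbl c ON c.Entry_Id = d.Entry_Id AND c.Start_Date = d.Start_Date
    GROUP BY c.Status_Id
    ORDER BY c.Status_Id
""").fetchall()

print(naive)  # entry 45176 shows up under both 492 and 493
```

There are four distinct entries, but the counts add up to five because the tie for entry 45176 passes through the join twice.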

Answers

1

An alternative answer, based on the OP's helpful comments. It needs only the following:

WITH 
    [sequenced_data] 
AS 
(
    SELECT 
    *, 
    ROW_NUMBER() OVER (PARTITION BY entry_id ORDER BY start_time DESC, status_id DESC) AS [sequence_id] 
    FROM 
    tbl 
    WHERE 
    start_time < '21:00' AND end_time > '19:00' 
) 
SELECT status_id, COUNT(*) 
FROM [sequenced_data] 
WHERE sequence_id = 1 
GROUP BY status_id 

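This uses the ROW_NUMBER() function, which handles the case where no single field uniquely identifies individual records. If the data had a unique identifying column, other queries could be written. However, SQL Server is very good at optimizing ROW_NUMBER() queries like the one above, and it should perform well (assuming suitable indexes).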
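
To see how the ROW_NUMBER() query above resolves the tie, here is a small sketch run against the question's test data. It assumes Python's built-in sqlite3 module with an SQLite version that supports window functions (3.25 or later) rather than SQL Server, and it drops the time filter because the test data has no end_time values.

```python
import sqlite3

# Assumption: SQLite (3.25+) stands in for SQL Server; data is from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (status_id INT, entry_id INT, start_time TEXT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", [
    (496, 45173, "2010-09-29 18:04:33.000"),
    (490, 45173, "2010-09-29 18:48:20.100"),
    (495, 45173, "2010-09-29 19:25:29.300"),
    (489, 45174, "2010-09-29 18:43:01.500"),
    (493, 45175, "2010-09-29 18:48:00.500"),
    (493, 45175, "2010-09-29 21:16:02.700"),
    (489, 45175, "2010-09-30 17:52:12.100"),
    (493, 45176, "2010-09-29 17:55:21.300"),
    (492, 45176, "2010-09-29 18:20:52.200"),
    (493, 45176, "2010-09-29 18:20:52.200"),
])

# Number the rows of each entry, latest first; status_id DESC breaks ties,
# so exactly one row per entry gets sequence_id = 1.
ranked = conn.execute("""
    WITH sequenced_data AS (
        SELECT status_id,
               ROW_NUMBER() OVER (PARTITION BY entry_id
                                  ORDER BY start_time DESC, status_id DESC)
                   AS sequence_id
        FROM tbl
    )
    SELECT status_id, COUNT(*)
    FROM sequenced_data
    WHERE sequence_id = 1
    GROUP BY status_id
    ORDER BY status_id
""").fetchall()

print(ranked)
```

Because the tie-breaker is status_id DESC, entry 45176 is counted under 493 rather than 492, which the question allows.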

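Edit

Someone just suggested to me that people don't like long code and prefer compact code. So the CTE version has been replaced with an inline version (the CTE really only helped break the query down for explanation, and it is in the edit history if needed)...

Edit

ROW_NUMBER() cannot be part of a WHERE clause, as the OP found. The query has been updated by putting the CTE back.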

+0

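Oh, this is a very, very nice solution! (Although your edit broke it, since the row number has to be in the select.) It is also twice as fast as what I came up with, because it doesn't need duplicate elimination, just a particular kind of sort – Cine 2010-10-29 12:24:02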

2

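If I understand correctly, you want to count distinct entries for a particular status within your time period... If so, you should use the DISTINCT clause inside count(), changing count(*) to count(distinct Entry_id)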

with c (Status_Id, Entry_Id, Start_Date) AS (
    select Status_Id, Entry_Id, Start_Date from tbl where 
    (End_Date BETWEEN '19000101' AND '21000101') 
    AND ((Start_Date BETWEEN '19000101' AND '21000101') 
    OR End_Date <= '21000101')) 
select Status_Id, count(distinct Entry_Id) as cnt from 
(select Entry_Id, max(start_date) as start_date from c 
    group by Entry_Id) d inner join 
c on c.Entry_Id = d.Entry_Id 
and c.start_date = d.start_date 
GROUP BY Status_Id WITH ROLLUP 

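Edit

As long as you don't care which status is returned for a given entry, I think you can modify the inner query to return the first status and join on the status too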

with c (Status_Id, Entry_Id, Start_Date) AS (
    select Status_Id, Entry_Id, Start_Date from tbl where 
    (End_Date BETWEEN '19000101' AND '21000101') 
    AND ((Start_Date BETWEEN '19000101' AND '21000101') 
    OR End_Date <= '21000101')) 
select c.Status_Id, count(c.Entry_Id) as cnt from 
(select Entry_Id, Start_Date, (select top 1 Status_id from c where Entry_Id = CC.Entry_Id and Start_Date = CC.Start_Date) as Status_Id 
    from (select Entry_Id, max(start_date) as start_date from c 
    group by Entry_Id) as CC) d inner join 
c on c.Entry_Id = d.Entry_Id 
and c.start_date = d.start_date 
and c.status_id = d.status_id 
GROUP BY c.Status_Id 

Result

Status_id Count 
489  2 
492  1 
495  1 
+0

If the status_ids are – Cine 2010-10-28 10:42:23

+0

So you need to count each distinct Entry_Id just one time, and this one only works when the duplicates are identical? – 2010-10-28 10:56:12

+0

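If an entry_id has multiple rows with the same max(start_time), it should be included only once, regardless of the status_id. Yours will include it more than once if the rows with the duplicate date have different status_ids. – Cine 2010-10-28 15:47:15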
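
The point made in the comment above can be checked on just the rows for entry 45176. This is a rough sketch, assuming Python's built-in sqlite3 module in place of SQL Server, with the date filter and WITH ROLLUP omitted.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; only entry 45176 is loaded.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (Status_Id INT, Entry_Id INT, Start_Date TEXT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", [
    (493, 45176, "2010-09-29 17:55:21.300"),
    (492, 45176, "2010-09-29 18:20:52.200"),
    (493, 45176, "2010-09-29 18:20:52.200"),
])

# DISTINCT is applied inside each Status_Id group, so the same entry
# is still counted once under every status that ties for the latest date.
per_status = conn.execute("""
    SELECT c.Status_Id, COUNT(DISTINCT c.Entry_Id) AS cnt
    FROM (SELECT Entry_Id, MAX(Start_Date) AS Start_Date
          FROM tbl GROUP BY Entry_Id) d
    JOIN tbl c ON c.Entry_Id = d.Entry_Id AND c.Start_Date = d.Start_Date
    GROUP BY c.Status_Id
    ORDER BY c.Status_Id
""").fetchall()

print(per_status)
```

A single entry produces two result rows here, one for 492 and one for 493, which is the behavior the comment objects to.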

0

I found a solution myself:

with c (Status_Id, Entry_Id, Start_Date) AS (
    select Status_Id, Entry_Id, Start_Date from tbl where 
    (End_Date BETWEEN '19000101' AND '21000101') 
    AND ((Start_Date BETWEEN '19000101' AND '21000101') 
    OR End_Date <= '21000101')) 
select Status_Id, count(*) as cnt from 
(select max(Status_Id) as Status_Id, c.Entry_Id from --<--- ADDED 
(select Entry_Id, max(start_date) as start_date from c 
    group by Entry_Id) d inner join 
c on c.Entry_Id = d.Entry_Id 
and c.start_date = d.start_date 
group by c.Entry_Id) y --<--- ADDED 
GROUP BY Status_Id WITH ROLLUP
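
As a quick check of the query above against the question's test data, here is a sketch that assumes Python's built-in sqlite3 module instead of SQL Server. The date filter and WITH ROLLUP are dropped (SQLite does not support ROLLUP), so no grand-total row appears.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; data is from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (Status_Id INT, Entry_Id INT, Start_Date TEXT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", [
    (496, 45173, "2010-09-29 18:04:33.000"),
    (490, 45173, "2010-09-29 18:48:20.100"),
    (495, 45173, "2010-09-29 19:25:29.300"),
    (489, 45174, "2010-09-29 18:43:01.500"),
    (493, 45175, "2010-09-29 18:48:00.500"),
    (493, 45175, "2010-09-29 21:16:02.700"),
    (489, 45175, "2010-09-30 17:52:12.100"),
    (493, 45176, "2010-09-29 17:55:21.300"),
    (492, 45176, "2010-09-29 18:20:52.200"),
    (493, 45176, "2010-09-29 18:20:52.200"),
])

# The extra GROUP BY c.Entry_Id with MAX(Status_Id) collapses tied rows
# to one row per entry before the outer count per status.
fixed = conn.execute("""
    SELECT Status_Id, COUNT(*) AS cnt
    FROM (
        SELECT MAX(c.Status_Id) AS Status_Id, c.Entry_Id
        FROM (SELECT Entry_Id, MAX(Start_Date) AS Start_Date
              FROM tbl GROUP BY Entry_Id) d
        JOIN tbl c ON c.Entry_Id = d.Entry_Id AND c.Start_Date = d.Start_Date
        GROUP BY c.Entry_Id
    ) y
    GROUP BY Status_Id
    ORDER BY Status_Id
""").fetchall()

print(fixed)
```

Each of the four entries is now counted exactly once, with the tie for 45176 resolved to the higher status_id.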