2010-02-22 52 views
2

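I have a table with the columns Id and EmployeeID. The data has a particular property: in some places, rows with consecutive Ids belong to the same employee, forming blocks of data. For example: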

Id | EmployeeID 
--------------- 
1 |  1 
2 |  1 
3 |  2 
4 |  5 
5 |  1 
6 |  1 

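I want to build a query that finds the blocks of rows that share the same EmployeeID and have consecutive Ids (with a minimum of x records per block). So far I have come up with: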

SELECT EmployeeID, MIN(Id), MAX(Id), COUNT(*) 
FROM recs 
GROUP BY EmployeeID 
HAVING COUNT(*) > 5 AND 
     MAX(Id) - MIN(Id) + 1 = COUNT(*) 

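This query returns some of the blocks (but not all): it works only when an employee appears in a single block. Can anyone come up with a solution that returns all the distinct blocks for each employee?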
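To make the limitation concrete, here is a minimal sketch that runs the query above against the sample rows. It assumes an in-memory SQLite database driven from Python in place of the real server; the helper name `grouped_blocks` is hypothetical, and the threshold is passed as a parameter and relaxed to `>= 2` so the six-row sample could match at all.

```python
import sqlite3

# Sample rows from the question: (Id, EmployeeID).
ROWS = [(1, 1), (2, 1), (3, 2), (4, 5), (5, 1), (6, 1)]


def grouped_blocks(rows, min_size):
    """Run the question's GROUP BY query on an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE recs (Id INTEGER PRIMARY KEY, EmployeeID INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO recs (Id, EmployeeID) VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT EmployeeID, MIN(Id), MAX(Id), COUNT(*)
        FROM recs
        GROUP BY EmployeeID
        HAVING COUNT(*) >= ?
           AND MAX(Id) - MIN(Id) + 1 = COUNT(*)
        ORDER BY EmployeeID
        """,
        (min_size,),
    ).fetchall()
    conn.close()
    return result


# Employee 1 owns Ids 1, 2, 5 and 6: MAX - MIN + 1 is 6 but COUNT(*) is 4,
# so both of that employee's blocks are dropped.
print(grouped_blocks(ROWS, 2))  # prints []
```

Grouping by EmployeeID alone collapses every block of an employee into one group, so the span test can only pass when that employee has exactly one block.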

Answers

1

Not the best solution, but it should work (example for 3 consecutive Ids):

-- c1, c2 and c3 count the rows of the same employee at Id-1, Id-2 and Id-3.
-- A row is returned only when all three of those preceding Ids exist.
SELECT Id, EmployeeID FROM 
(
SELECT r.Id, r.EmployeeID, 
(SELECT COUNT(1) FROM recs r1 WHERE r1.EmployeeID = r.EmployeeID AND r1.Id = r.Id-1) AS c1, 
(SELECT COUNT(1) FROM recs r2 WHERE r2.EmployeeID = r.EmployeeID AND r2.Id = r.Id-2) AS c2, 
(SELECT COUNT(1) FROM recs r3 WHERE r3.EmployeeID = r.EmployeeID AND r3.Id = r.Id-3) AS c3 
FROM recs r) tab1 
WHERE (tab1.c1+tab1.c2+tab1.c3 = 3); 

I am assuming Id is a primary (or unique) key. If it is not, change each subquery to SELECT IF(COUNT(1) > 0, 1, 0) .....

2

Join the table to itself on table1.Id = table2.Id + 1 and table1.employeeid = table2.employeeid.

+0

That is a first step, but I still need to get only the blocks with at least 5 consecutive Ids. Your solution would fetch all consecutive rows. – Anax 2010-02-23 00:47:40
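One way the self-join idea might be extended to whole blocks with a minimum size is sketched below. It is an illustration only, not the answerer's query: it assumes Id is unique, uses an in-memory SQLite database from Python in place of the real server, and the names `BLOCKS_SQL` and `find_blocks` are hypothetical. A row starts a block when the same employee has no row at Id - 1, it ends a block when there is no row at Id + 1, and each start is paired with the nearest end at or after it.

```python
import sqlite3

BLOCKS_SQL = """
SELECT s.EmployeeID,
       s.Id                 AS StartId,
       MIN(e.Id)            AS EndId,
       MIN(e.Id) - s.Id + 1 AS Size
FROM recs AS s
JOIN recs AS e
  ON e.EmployeeID = s.EmployeeID
 AND e.Id >= s.Id
-- s is a block start: the same employee has no row at Id - 1
WHERE NOT EXISTS (SELECT 1 FROM recs AS p
                  WHERE p.EmployeeID = s.EmployeeID AND p.Id = s.Id - 1)
-- e is a block end: the same employee has no row at Id + 1
  AND NOT EXISTS (SELECT 1 FROM recs AS n
                  WHERE n.EmployeeID = e.EmployeeID AND n.Id = e.Id + 1)
GROUP BY s.EmployeeID, s.Id
HAVING MIN(e.Id) - s.Id + 1 >= :min_size
ORDER BY s.EmployeeID, s.Id
"""


def find_blocks(rows, min_size):
    """Return (EmployeeID, StartId, EndId, Size) for each qualifying block."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE recs (Id INTEGER PRIMARY KEY, EmployeeID INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO recs (Id, EmployeeID) VALUES (?, ?)", rows)
    result = conn.execute(BLOCKS_SQL, {"min_size": min_size}).fetchall()
    conn.close()
    return result


# Sample rows from the question: (Id, EmployeeID).
sample = [(1, 1), (2, 1), (3, 2), (4, 5), (5, 1), (6, 1)]
print(find_blocks(sample, 2))  # prints [(1, 1, 2, 2), (1, 5, 6, 2)]
```

Because every Id between a start and its nearest end belongs to the same employee, the block size is simply EndId - StartId + 1, which is what the HAVING clause compares against the minimum.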

0

Use a temporary table for this. Try this solution:

SELECT EmployeeID, MIN(Id) AS Min, MAX(Id) AS Max, COUNT(*) AS Count 
INTO #TempTable 
FROM recs 
GROUP BY EmployeeID 

SELECT * FROM #TempTable WHERE 
Count > 5 AND 
     Max - Min + 1 = Count 

EDITED ANSWER

Please try this:

SELECT * FROM( 
SELECT EmployeeID, MIN(Id) AS min, MAX(Id) AS max, COUNT(*) AS count 
    FROM recs 
    GROUP BY EmployeeID) AS [Table] 
    WHERE [Table].count > 5 AND 
      [Table].max - [Table].min + 1 = [Table].count 
+0

I believe this does exactly the same as the query I provided. It only fetches a block when the employee appears in a single block. – Anax 2010-02-23 08:04:20

+0

Please see the edited answer. – 2010-02-23 08:30:18

+0

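This still does not work. Try it with the provided data set (replacing Table.count > 5 with Table.count >= 2) to see for yourself. You are still approaching the problem in the same way. – Anax 2010-02-23 12:43:27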

0

Wow, this is a real puzzle. I'm sure this has all sorts of holes, but here is one possible solution. First, our test data:

If Exists(Select 1 From INFORMATION_SCHEMA.TABLES Where TABLE_NAME = 'recs') 
    DROP TABLE recs 
GO 
Create Table recs 
(
    Id int not null 
    , EmployeeId int not null 
) 
Insert recs(Id, EmployeeId) 
Values (1,1) ,(2,1) ,(3,1) ,(4,2) ,(5,5) ,(6,1) ,(7,1) ,(8,1) ,(10,1) 
    ,(11,1) ,(12,1) ,(13,2) ,(14,2) ,(15,2) ,(16,2) 

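Next, you will need a Tally or Numbers table containing a sequence of numbers. I only put about 500 elements in this one, but you may need more depending on the size of your data. The largest number in the Tally table should be greater than the largest Id in the recs table.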

Create Table dbo.Tally(Num int not null) 
GO 
;With Numbers As 
    (
    Select ROW_NUMBER() OVER (ORDER BY s1.object_id) As Num 
    From sys.columns as s1 
    ) 
Insert dbo.Tally(Num) 
Select Num 
From Numbers 
Where Num < 500 

Now for the actual solution. Essentially, I use a series of CTEs to work out the start and end points of the consecutive sequences.

Declare @SequenceSize int = 3 -- minimum block length; example value only
; With 
    Employees As 
    (
    Select Distinct EmployeeId 
    From dbo.Recs 
    ) 
    , SequenceGaps As 
    (
    Select E.EmployeeId, T.Num, R1.Id 
    From dbo.Tally As T 
     Cross Join Employees As E 
     Left Join dbo.recs As R1 
      On R1.EmployeeId = E.EmployeeId 
       And R1.Id = T.Num 
    Where T.Num <= ( 
     Select Max(R3.Id) 
     From dbo.Recs As R3 
      Where R3.EmployeeId = E.EmployeeId 
      ) 
    ) 
    , EndIds As 
    (
    Select S.EmployeeId 
     , Case When S1.Id Is Null Then S.Id End As [End] 
    From SequenceGaps As S 
     Join SequenceGaps As S1 
      On S1.EmployeeId = S.EmployeeId 
       And S1.Num = (S.Num + 1) 
    Where S.Id Is Not Null 
     And S1.Id Is Null 
    Union All 
    Select S.EmployeeId, Max(Id) 
    From SequenceGaps As S 
    Where S.Id Is Not Null 
    Group By S.EmployeeId 
    ) 
    , SequencedEndIds As 
    (
    Select EmployeeId, [End] 
     , ROW_NUMBER() OVER (PARTITION BY EmployeeId ORDER BY [End]) As SequenceNum 
    From EndIds 
    ) 
    , StartIds As 
    (
    Select S.EmployeeId 
     , Case When S1.Id Is Null Then S.Id End As [Start] 
    From SequenceGaps As S 
     Join SequenceGaps As S1 
      On S1.EmployeeId = S.EmployeeId 
       And S1.Num = (S.Num - 1) 
    Where S.Id Is Not Null 
     And S1.Id Is Null 
    Union All 
    Select S.EmployeeId, 1 
    From SequenceGaps As S 
    Where S.Id = 1 
    ) 
    , SequencedStartIds As 
    (
    Select EmployeeId, [Start] 
     , ROW_NUMBER() OVER (PARTITION BY EmployeeId ORDER BY [Start]) As SequenceNum 
    From StartIds 
    ) 
    , SequenceRanges As 
    (
    Select S1.EmployeeId, Start, [End] 
    From SequencedStartIds As S1 
     Join SequencedEndIds As S2 
      On S2.EmployeeId = S1.EmployeeId 
       And S2.SequenceNum = S1.SequenceNum 
    ) 
Select * 
From SequenceGaps As SG 
Where Exists(
     Select 1 
     From SequenceRanges As SR 
     Where SR.EmployeeId = SG.EmployeeId 
      And SG.Id Between SR.Start And SR.[End] 
      And (SR.[End] - SR.[Start] + 1) >= @SequenceSize 
     ) 

You can control which sequences are returned through the WHERE clause and @SequenceSize in the final statement.
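For comparison, the same start and end points can also be derived without a Tally table. The sketch below swaps in a different technique, the ROW_NUMBER() difference trick often used for "gaps and islands" problems, and is not the query above. It assumes an in-memory SQLite database (version 3.25 or later, for window functions) driven from Python in place of SQL Server; `ISLANDS_SQL` and `find_islands` are hypothetical names. Within one employee's rows ordered by Id, Id minus the row number stays constant across a consecutive run, so that difference can serve as a grouping key.

```python
import sqlite3

ISLANDS_SQL = """
WITH numbered AS (
    SELECT Id,
           EmployeeID,
           -- constant within each run of consecutive Ids for one employee
           Id - ROW_NUMBER() OVER (PARTITION BY EmployeeID ORDER BY Id) AS grp
    FROM recs
)
SELECT EmployeeID, MIN(Id) AS StartId, MAX(Id) AS EndId, COUNT(*) AS Size
FROM numbered
GROUP BY EmployeeID, grp
HAVING COUNT(*) >= :min_size
ORDER BY EmployeeID, StartId
"""


def find_islands(rows, min_size):
    """Return (EmployeeID, StartId, EndId, Size) for each qualifying block."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE recs (Id INTEGER PRIMARY KEY, EmployeeID INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO recs (Id, EmployeeID) VALUES (?, ?)", rows)
    result = conn.execute(ISLANDS_SQL, {"min_size": min_size}).fetchall()
    conn.close()
    return result


# Test data from the answer above: (Id, EmployeeId).
test_rows = [
    (1, 1), (2, 1), (3, 1), (4, 2), (5, 5), (6, 1), (7, 1), (8, 1),
    (10, 1), (11, 1), (12, 1), (13, 2), (14, 2), (15, 2), (16, 2),
]
for block in find_islands(test_rows, 3):
    print(block)
# prints (1, 1, 3, 3), (1, 6, 8, 3), (1, 10, 12, 3) and (2, 13, 16, 4)
```

The CTE and the ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) expression use the same syntax as the T-SQL above, so the query should port to SQL Server with the parameter replaced by @SequenceSize, though that has not been verified here.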