2011-08-31 55 views
11

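Group close numbers

I have a table with two integer columns. The first column represents the start index and the second column the end index.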

START END 
1  8 
9  13 
14 20 
20 25 
30 42 
42 49 
60 67 

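Simple so far. What I would like to do is group all records that follow one another:

START END 
1  25 
30 49 
60 67 

A record can follow another by starting at the same index as the previous end index, or at a margin of 1:

START END 
1  10 
10 20 

And

START END 
1  10 
11 20 

will both result in

START END 
1  20 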

I'm using SQL Server 2008 R2.

Any help would be great.

+4

I think this is an interesting problem, but have you actually made any attempt at it yourself? Any queries you've tried? – jadarnel27

+2

Could you have any overlapping pairs, such as '1,8' and '3,15'? –

+1

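Thanks for your comment, Martin. There are no overlapping pairs. Jadarnel27: I solved this with a SQL cursor, but that solution isn't very efficient, and I'm looking for a better, more elegant one. –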

Answers

3

This works for your example; let me know if it doesn't work for other data.

create table #Range 
(
    [Start] INT, 
    [End] INT 
) 

insert into #Range ([Start], [End]) Values (1, 8) 
insert into #Range ([Start], [End]) Values (9, 13) 
insert into #Range ([Start], [End]) Values (14, 20) 
insert into #Range ([Start], [End]) Values (20, 25) 
insert into #Range ([Start], [End]) Values (30, 42) 
insert into #Range ([Start], [End]) Values (42, 49) 
insert into #Range ([Start], [End]) Values (60, 67) 



;with RangeTable as 
(select 
    t1.[Start], 
    t1.[End], 
    row_number() over (order by t1.[Start]) as [Index] 
from 
    #Range t1 
where t1.Start not in (select 
         [End] 
       from 
        #Range 
        Union 
       select 
        [End] + 1 
       from 
        #Range 
       ) 
) 
select 
    t1.[Start], 
    case 
    when t2.[Start] is null then 
     (select max([End]) 
        from #Range) 
     else 
     (select max([End]) 
        from #Range 
        where t2.[Start] > [End]) 
end as [End]  
from 
    RangeTable t1 
left join 
    RangeTable t2 
on 
    t1.[Index] = t2.[Index]-1 

drop table #Range; 
+0

Hi Aducci, your solution also works fine on other data sets larger than the example table. –

+0

@Liran Ben Yehuda - Is there a reason you haven't marked it as the answer? – Aducci

+0

Thanks for your support. I'm just looking for the best solution, and I have to run some performance tests first. –

4

Edited to include another version, which I think is a bit more reliable and which also works with overlapping ranges:

CREATE TABLE #data (start_range INT, end_range INT) 
INSERT INTO #data VALUES (1,8) 
INSERT INTO #data VALUES (2,15) 
INSERT INTO #data VALUES (9,13) 
INSERT INTO #data VALUES (14,20) 
INSERT INTO #data VALUES (13,26) 
INSERT INTO #data VALUES (12,21) 
INSERT INTO #data VALUES (9,25) 
INSERT INTO #data VALUES (20,25) 
INSERT INTO #data VALUES (30,42) 
INSERT INTO #data VALUES (42,49) 
INSERT INTO #data VALUES (60,67) 

;with ranges as 
(
SELECT start_range as level 
,end_range as end_range 
,row_number() OVER (PARTITION BY (SELECT NULL) ORDER BY start_range) as row 
FROM #data 
UNION ALL 
SELECT 
level + 1 as level 
,end_range as end_range 
,row 
From ranges 
WHERE level < end_range 
) 
,ranges2 AS 
(
SELECT DISTINCT 
level 
FROM ranges 
) 
,ranges3 AS 
(
SELECT 
level 
,row_number() OVER (ORDER BY level) - level as grouping_group 
from ranges2 
) 
SELECT 
MIN(level) as start_number 
,MAX(level) as end_number 
FROM ranges3 
GROUP BY grouping_group 
ORDER BY start_number ASC 

I think this should work, though it's probably not especially efficient on larger sets...

CREATE TABLE #data (start_range INT, end_range INT) 
INSERT INTO #data VALUES (1,8) 
INSERT INTO #data VALUES (2,15) 
INSERT INTO #data VALUES (9,13) 
INSERT INTO #data VALUES (14,20) 
INSERT INTO #data VALUES (21,25) 
INSERT INTO #data VALUES (30,42) 
INSERT INTO #data VALUES (42,49) 
INSERT INTO #data VALUES (60,67) 


;with overlaps as 
(
select start_range 
,end_range 
,row_number() OVER (ORDER BY start_range ASC, end_range ASC) as line_number 
from #data 
) 
,overlaps2 AS 
(
SELECT 
O1.start_range 
,O1.end_range 
,O1.line_number 
-- 1 when no earlier row overlaps or touches this row (gap of 2 or more), otherwise 0 
,CASE WHEN EXISTS (SELECT 1 FROM overlaps O0 
                   WHERE O0.line_number < O1.line_number 
                   AND O1.start_range - O0.end_range < 2) THEN 0 ELSE 1 END as group_start 
FROM overlaps O1 
) 
,overlaps3 AS 
(
SELECT 
O2.start_range 
,O2.end_range 
-- running count of group starts = group number (correlated subquery, since SQL Server 2008 R2 has no windowed running SUM) 
,(SELECT SUM(O3.group_start) FROM overlaps2 O3 WHERE O3.line_number <= O2.line_number) as overlap_group 
FROM overlaps2 O2 
) 
SELECT 
MIN(start_range) as range_start 
,MAX(end_range) as range_end 
,MAX(end_range) - MIN(start_range) as range_span 
FROM overlaps3 
GROUP BY overlap_group 
+0

+1 Tested here, and it works. Good thing you included the CREATE and INSERT statements. –

+0

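Hi Davin, your second solution is more reliable, because the first one doesn't work well. Actually, the original table doesn't contain any overlaps. If you have any idea how to solve this without overlaps in a more efficient way, I'd like to know. Thanks for your help :) –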

+0

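@Liran Ben Yehuda - In your original question you wanted both the 1-10, 11-20 example and the 1-10, 10-20 example to give a range of 1-20, so there is an overlap: in the second case, 10 appears twice. Does this mean that in the actual table every start and end range value is unique? – Dibstar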

3

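You can use a number table to solve this problem. Basically, you first expand the ranges into their individual values, then group the consecutive values back together.

Here's an implementation: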

WITH data (START, [END]) AS (
    SELECT 1, 8 UNION ALL 
    SELECT 9, 13 UNION ALL 
    SELECT 14, 20 UNION ALL 
    SELECT 20, 25 UNION ALL 
    SELECT 30, 42 UNION ALL 
    SELECT 42, 49 UNION ALL 
    SELECT 60, 67 
), 
expanded AS (
    SELECT DISTINCT 
    N = d.START + v.number 
    FROM data d 
    INNER JOIN master..spt_values v ON v.number BETWEEN 0 AND d.[END] - d.START 
    WHERE v.type = 'P' 
), 
marked AS (
    SELECT 
    N, 
    SeqID = N - ROW_NUMBER() OVER (ORDER BY N) 
    FROM expanded 
) 
SELECT 
    START = MIN(N), 
    [END] = MAX(N) 
FROM marked 
GROUP BY SeqID 
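
To see why `N - ROW_NUMBER()` works as a grouping key, here is a minimal sketch on a hand-made set of values (1, 2, 3, 7, 8 are made up for illustration and are not taken from the question's data). Within a run of consecutive numbers, both N and the row number increase by 1 per row, so their difference stays constant; after a gap, N jumps ahead of the row number and the difference changes.

```sql
-- Illustrative values only: two runs of consecutive numbers, 1-3 and 7-8.
WITH expanded (N) AS (
    SELECT 1 UNION ALL
    SELECT 2 UNION ALL
    SELECT 3 UNION ALL
    SELECT 7 UNION ALL
    SELECT 8
)
SELECT
    N,
    RowNum = ROW_NUMBER() OVER (ORDER BY N),
    SeqID  = N - ROW_NUMBER() OVER (ORDER BY N)
FROM expanded
ORDER BY N;
-- N = 1, 2, 3 get RowNum 1, 2, 3 and SeqID 0
-- N = 7, 8    get RowNum 4, 5    and SeqID 3
```

Grouping by SeqID and taking MIN(N) and MAX(N) then returns one row per run, which is what the final SELECT of the query above does.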

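This solution uses master..spt_values as a number table to expand the initial ranges. However, if any of the ranges can span more than 2048 (consecutive) values, you should define and use your own number table.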
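
A minimal sketch of one way to build such a number table, assuming a hypothetical permanent table called dbo.Numbers and an arbitrary size of 100,000 rows (0 through 99999). Both the name and the size are my assumptions, so adjust them to the widest range you expect.

```sql
-- dbo.Numbers is a hypothetical name; the size (0..99999) is chosen arbitrarily.
CREATE TABLE dbo.Numbers (number INT NOT NULL PRIMARY KEY);

-- Cross join five copies of the digits 0-9 to produce every value from 0 to 99999.
;WITH digits (digit) AS (
    SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL
    SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL
    SELECT 8 UNION ALL SELECT 9
)
INSERT INTO dbo.Numbers (number)
SELECT d0.digit
     + 10    * d1.digit
     + 100   * d2.digit
     + 1000  * d3.digit
     + 10000 * d4.digit
FROM digits d0
CROSS JOIN digits d1
CROSS JOIN digits d2
CROSS JOIN digits d3
CROSS JOIN digits d4;
```

With a table like this in place, the join in the expanded CTE could read `INNER JOIN dbo.Numbers v ON v.number BETWEEN 0 AND d.[END] - d.START`, and the `WHERE v.type = 'P'` filter would no longer be needed.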