2012-09-05 60 views
7

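How to group consecutive ranges using MySQL

I have a table containing categories, dates, and rates. Each category can have different rates on different dates, but a category can have only one rate on a given date.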

Id      CatId  Date        Rate
------  -----  ----------  ----
000001  12     2009-07-07  1
000002  12     2009-07-08  1
000003  12     2009-07-09  1
000004  12     2009-07-10  2
000005  12     2009-07-15  1
000006  12     2009-07-16  1
000007  13     2009-07-08  1
000008  13     2009-07-09  1
000009  14     2009-07-07  2
000010  14     2009-07-08  1
000010  14     2009-07-10  1

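Unique index (CatId, Date, Rate). For each category, I want to group all consecutive date ranges and keep only the beginning and the end of each range. For the example above, we would have: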

CatId  Begin       End         Rate
-----  ----------  ----------  ----
12     2009-07-07  2009-07-09  1
12     2009-07-10  2009-07-10  2
12     2009-07-15  2009-07-16  1
13     2009-07-08  2009-07-09  1
14     2009-07-07  2009-07-07  2
14     2009-07-08  2009-07-08  1
14     2009-07-10  2009-07-10  1

I found a similar solution on the forum, but it does not give exactly this result:

WITH q AS 
     (
     SELECT *, 
       ROW_NUMBER() OVER (PARTITION BY CatId, Rate ORDER BY [Date]) AS rnd, 
       ROW_NUMBER() OVER (PARTITION BY CatId ORDER BY [Date]) AS rn 
     FROM my_table 
     ) 
SELECT CatId AS catidd, MIN([Date]) as beginn, MAX([Date])as endd, Rate 
FROM q 
GROUP BY CatId, rnd - rn, Rate 

See the SQL Fiddle. How can I do the same thing in MySQL? Please help!
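The reason the ROW_NUMBER query falls short can be seen by replaying it outside the database. The following is a minimal Python sketch, not part of the original post, that computes the same `rnd - rn` key over the sample rows; `ROWS` and `row_number_groups` are names made up for this illustration. The key only changes when the rate changes within a category, so a gap in the dates does not start a new range.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (CatId, Date, Rate).
ROWS = [
    (12, date(2009, 7, 7), 1), (12, date(2009, 7, 8), 1),
    (12, date(2009, 7, 9), 1), (12, date(2009, 7, 10), 2),
    (12, date(2009, 7, 15), 1), (12, date(2009, 7, 16), 1),
    (13, date(2009, 7, 8), 1), (13, date(2009, 7, 9), 1),
    (14, date(2009, 7, 7), 2), (14, date(2009, 7, 8), 1),
    (14, date(2009, 7, 10), 1),
]


def row_number_groups(rows):
    """Group rows by (CatId, rnd - rn, Rate), as the ROW_NUMBER query does."""
    rn = defaultdict(int)    # ROW_NUMBER() OVER (PARTITION BY CatId ORDER BY Date)
    rnd = defaultdict(int)   # ROW_NUMBER() OVER (PARTITION BY CatId, Rate ORDER BY Date)
    groups = defaultdict(list)
    for cat, day, rate in sorted(rows, key=lambda r: (r[0], r[1])):
        rn[cat] += 1
        rnd[(cat, rate)] += 1
        groups[(cat, rnd[(cat, rate)] - rn[cat], rate)].append(day)
    # One output row per group: (CatId, Begin, End, Rate).
    return sorted(
        (cat, min(days), max(days), rate)
        for (cat, _, rate), days in groups.items()
    )


for row in row_number_groups(ROWS):
    print(row)
```

On this sample data the sketch produces six ranges instead of the seven wanted, because (CatId, Rate) = (14, 1) comes back as a single range from 2009-07-08 to 2009-07-10 even though 2009-07-09 is missing.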

+0

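Why does your example show a result range from '2009-07-08' to '2009-07-10' for (CatId, Rate) = (14, 1) when the underlying table has no '2009-07-09'? Cf. (CatId, Rate) = (12, 1), which yields two result ranges because of its discontinuity. – eggyal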

+0

Thanks eggyal, it is corrected now. – Fouzi

Answers

6

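MySQL does not support analytic functions (window functions were only added in MySQL 8.0), but you can emulate this behaviour with user-defined variables: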

SELECT CatID, Begin, MAX(Date) AS End, Rate 
FROM (
    SELECT my_table.*, 
      @f:=CONVERT(
      IF(@c<=>CatId AND @r<=>Rate AND DATEDIFF(Date, @d)=1, @f, Date), DATE 
      ) AS Begin, 
      @c:=CatId, @d:=Date, @r:=Rate 
    FROM  my_table JOIN (SELECT @c:=NULL) AS init 
    ORDER BY CatId, Rate, Date 
) AS t 
GROUP BY CatID, Begin, Rate 

See it on sqlfiddle.
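The row-by-row logic of the query above can be sketched outside MySQL as follows. This is a minimal Python illustration under the assumption that rows are visited in (CatId, Rate, Date) order, as the ORDER BY requests; `ROWS` and `consecutive_ranges` are hypothetical names, and `prev` plays the role of `@c`, `@r` and `@d`.

```python
from datetime import date

# Sample rows from the question: (CatId, Date, Rate).
ROWS = [
    (12, date(2009, 7, 7), 1), (12, date(2009, 7, 8), 1),
    (12, date(2009, 7, 9), 1), (12, date(2009, 7, 10), 2),
    (12, date(2009, 7, 15), 1), (12, date(2009, 7, 16), 1),
    (13, date(2009, 7, 8), 1), (13, date(2009, 7, 9), 1),
    (14, date(2009, 7, 7), 2), (14, date(2009, 7, 8), 1),
    (14, date(2009, 7, 10), 1),
]


def consecutive_ranges(rows):
    """Return (CatId, Begin, End, Rate) for each run of consecutive dates."""
    ranges = []
    prev = None  # (CatId, Rate, Date) of the previous row
    # Visit rows ordered by CatId, then Rate, then Date.
    for cat, day, rate in sorted(rows, key=lambda r: (r[0], r[2], r[1])):
        same_run = (
            prev is not None
            and prev[0] == cat
            and prev[1] == rate
            and (day - prev[2]).days == 1   # DATEDIFF(Date, @d) = 1
        )
        if same_run:
            ranges[-1][2] = day                    # keep Begin, push End forward
        else:
            ranges.append([cat, day, day, rate])   # Begin restarts at this Date
        prev = (cat, rate, day)
    return [tuple(r) for r in ranges]


for row in sorted(consecutive_ranges(ROWS)):
    print(row)
```

Because a new range starts whenever the previous date is not exactly one day earlier, (14, 1) is split into two single-day ranges, which matches the output requested in the question.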

+0

Seems to work as expected! Thanks a lot! – Fouzi

+1

What does '<=>' mean? –

+1

@vanabel: It is MySQL's [NULL-safe equal-to operator](http://dev.mysql.com/doc/en/comparison-operators.html#operator_equal-to). – eggyal
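As a rough illustration of the difference between `=` and `<=>`, here is a small Python sketch that models SQL NULL as `None`. The function names `sql_eq` and `null_safe_eq` are invented for this example and are not MySQL functions.

```python
def sql_eq(a, b):
    """Model of SQL `=`: the result is NULL (None) if either operand is NULL."""
    if a is None or b is None:
        return None
    return a == b


def null_safe_eq(a, b):
    """Model of MySQL `<=>`: never returns NULL; NULL <=> NULL is true."""
    if a is None or b is None:
        return a is None and b is None
    return a == b


print(sql_eq(None, 12), null_safe_eq(None, 12))
print(sql_eq(None, None), null_safe_eq(None, None))
print(sql_eq(12, 12), null_safe_eq(12, 12))
```

In the answer's query the variable `@c` is initialised to NULL, so with `<=>` the comparison against the first row's CatId gives a definite false rather than NULL.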

3
SELECT catid,min(ddate),max(ddate),rate 
FROM (
    SELECT 
     Catid, 
     Ddate, 
     rate, 
     @rn := CASE WHEN (@prev <> rate 
      or DATEDIFF(ddate, @prev_date)>1) THEN @rn+1 ELSE @rn END AS rn, 
     @prev := rate, 
     @prev_id := catid , 
     @prev_date :=ddate 
    FROM (
     SELECT CatID,Ddate,rate 
     FROM rankdate 
     ORDER BY CatID, Ddate) AS a , 
     (SELECT @prev := -1, @rn := 0, @prev_id:=0 ,@prev_date:=-1) AS vars  

) T1 group by catid,rn 

Note: the line (SELECT @prev := -1, @rn := 0, @prev_id := 0, @prev_date := -1) AS vars is not needed in MySQL Workbench, but it is needed when running the query through PHP's mysql_query function.

SQL FIDDLE HERE

+0

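If we delete the record with Id = '000004', your query returns (Begin: 2009-07-07, End: 2009-07-16, Rate: 1), which is incorrect because there is a gap; it should return (Begin: 2009-07-07, End: 2009-07-09, Rate: 1) and (Begin: 2009-07-15, End: 2009-07-16, Rate: 1). [SQL FIDDLE HERE](http://sqlfiddle.com/#!2/513b2/1) – Fouzi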

+0

@BoussahelBachir, I have edited the answer. The condition you mentioned needed to be included to fit your case. – sel

+0

You don't seem to test '@prev_id' anywhere... What happens if two consecutive dates with the same 'Rate' have different 'CatId' values? – eggyal
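One way to probe this question is to replay the scan by hand. The sketch below is a Python approximation of the answer's query, not the query itself: the counter `rn` advances on a rate change or a date gap of more than one day and never looks at the category, and the result is then grouped by (CatId, rn) as in the outer GROUP BY. The names `ROWS`, `scan_without_category_check` and `group_by_cat_and_rn` are hypothetical.

```python
from datetime import date

# Sample rows from the question: (CatId, Date, Rate).
ROWS = [
    (12, date(2009, 7, 7), 1), (12, date(2009, 7, 8), 1),
    (12, date(2009, 7, 9), 1), (12, date(2009, 7, 10), 2),
    (12, date(2009, 7, 15), 1), (12, date(2009, 7, 16), 1),
    (13, date(2009, 7, 8), 1), (13, date(2009, 7, 9), 1),
    (14, date(2009, 7, 7), 2), (14, date(2009, 7, 8), 1),
    (14, date(2009, 7, 10), 1),
]


def scan_without_category_check(rows):
    """Number rows with a counter that ignores CatId, like @rn in the answer."""
    rn = 0
    prev_rate, prev_day = None, None
    numbered = []
    for cat, day, rate in sorted(rows, key=lambda r: (r[0], r[1])):
        rate_changed = prev_rate is None or prev_rate != rate
        gap = prev_day is not None and (day - prev_day).days > 1
        if rate_changed or gap:
            rn += 1
        numbered.append((cat, rn, day, rate))
        prev_rate, prev_day = rate, day
    return numbered


def group_by_cat_and_rn(numbered):
    """Aggregate to (CatId, Begin, End, Rate) per (CatId, rn) group."""
    groups = {}
    for cat, rn, day, rate in numbered:
        begin, end, _ = groups.get((cat, rn), (day, day, rate))
        groups[(cat, rn)] = (min(begin, day), max(end, day), rate)
    return sorted((cat, b, e, r) for (cat, _), (b, e, r) in groups.items())


numbered = scan_without_category_check(ROWS)
for row in group_by_cat_and_rn(numbered):
    print(row)
```

On the sample data the counter indeed stays the same when moving from the last row of CatId 12 to the first row of CatId 13, since the rate is unchanged and the date goes backwards. The two categories still end up in separate ranges only because the outer grouping includes CatId as well as rn.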

0

I know I am very late, but I am still posting a solution that worked for me. I had the same problem, and this is how I solved it, using variables:

SELECT MIN(id) AS id, MIN(date) AS date, MIN(state) AS state, COUNT(*) cnt 
FROM (
    SELECT @r := @r + (@state != state OR @state IS NULL) AS gn, 
      @state := state AS sn, 
      s.id, s.date, s.state 
    FROM (
      SELECT @r := 0, 
        @state := NULL 
      ) vars, 
      t_range s 
    ORDER BY 
      date, state 
    ) q 
GROUP BY gn 

More details, along with a good solution, are at: https://explainextended.com/2009/07/24/mysql-grouping-continuous-ranges/
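The idea behind the query in this answer can be sketched in Python as a run-length grouping: rows are visited in date order and a new group starts each time `state` differs from the previous row. The data in `ROWS` is invented for illustration (the `t_range` table contents are not shown in the post), and `state_runs` is a hypothetical name.

```python
from itertools import groupby

# Hypothetical rows: (id, date, state), already ordered by date.
ROWS = [
    (1, "2009-07-01", "A"),
    (2, "2009-07-02", "A"),
    (3, "2009-07-03", "B"),
    (4, "2009-07-04", "A"),
]


def state_runs(rows):
    """Return (min id, min date, state, count) for each run of equal states."""
    runs = []
    # groupby starts a new group whenever the key changes between
    # adjacent rows, which is what the @r counter does in the query.
    for state, run in groupby(rows, key=lambda r: r[2]):
        run = list(run)
        runs.append((
            min(r[0] for r in run),
            min(r[1] for r in run),
            state,
            len(run),
        ))
    return runs


for row in state_runs(ROWS):
    print(row)
```

Note that, as written, this approach starts a new group only when the state changes. It does not test whether adjacent dates are exactly one day apart, so a gap in the dates would not by itself split a range.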