2014-03-03 77 views
7

I found many Stack Overflow Q&As about consecutive days.
Still, the answers are too short for me to understand what is going on.

To be concrete, I'll make up a model (or table).
(I'm using PostgreSQL, if it makes a difference.)

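CREATE TABLE work (
    id serial NOT NULL,  -- serial, so the INSERTs below can omit id
    user_id integer NOT NULL, 
    arrived_at timestamp with time zone NOT NULL 
); 


-- ISO dates (January 3 and 4, 2011), independent of the DateStyle setting
insert into work(user_id, arrived_at) values(1, '2011-01-03'); 
insert into work(user_id, arrived_at) values(1, '2011-01-04'); 

  1. (Simplest form) For a given user, I want to find the last consecutive date range.

  2. (My ultimate goal) For a given user, I want to find his current run of consecutive working days.
    If he came to work yesterday, he still (as of today) has a chance to continue the run, so I show him the consecutive days up to yesterday.
    But if he missed yesterday, his consecutive days are 0 or 1, depending on whether he came today.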

Say today is the 8th.

3 * 5 6 7 * = 3 days (5 to 7) 
3 * 5 6 7 8 = 4 days (5 to 8) 
3 4 5 * 7 * = 1 day (7 to 7) 
3 * * * * * = 0 day 
3 * * * * 8 = 1 day (8 to 8) 
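
The rule behind these five cases can be sketched outside SQL to make it explicit. The following is a minimal Python illustration, not one of the posted answers; the function name `current_streak` and the choice of March 2014 for the day numbers are assumptions made for the example. It starts from today if today is present, otherwise from yesterday, and walks backwards while the days stay consecutive.

```python
from datetime import date, timedelta


def current_streak(worked_days, today):
    """Count consecutive worked days ending today, or yesterday if today is missing."""
    days = set(worked_days)
    # Today counts if present; otherwise the streak may still end yesterday.
    cursor = today if today in days else today - timedelta(days=1)
    streak = 0
    while cursor in days:
        streak += 1
        cursor -= timedelta(days=1)
    return streak


# Hypothetical mapping of the shorthand day numbers to dates.
def day(n):
    return date(2014, 3, n)


print(current_streak([day(3), day(5), day(6), day(7)], day(8)))  # 3 (5 to 7)
print(current_streak([day(3)], day(8)))                          # 0
```

The SQL answers below implement the same walk, either recursively or with window functions.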
+1

Interesting question... could you please add the table schema? –

+2

Schema and sample data (as 'CREATE TABLE' and 'INSERT's) and expected results, please. –

+0

Please add real DDL + sample data. Please don't use shorthand notation. – joop

Answers

2

This is how I solved this problem using a CTE:

WITH RECURSIVE CTE(attendanceDate) 
AS 
(
    SELECT * FROM 
    (
     SELECT attendanceDate FROM attendance WHERE attendanceDate = current_date 
     OR attendanceDate = current_date - INTERVAL '1 day' 
     ORDER BY attendanceDate DESC 
     LIMIT 1 
    ) tab 
    UNION ALL 

    SELECT a.attendanceDate FROM attendance a 
    INNER JOIN CTE c 
    ON a.attendanceDate = c.attendanceDate - INTERVAL '1 day' 
) 
SELECT COUNT(*) FROM CTE; 

Check the code on SQL Fiddle.

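Here is how the query works:

  1. It selects today's record from the attendance table. If today's record is not available, it selects yesterday's record.
  2. It then keeps recursively adding the record for the day before the earliest date found so far.

If you want to select the latest consecutive date range regardless of when the user's latest attendance was (today, yesterday, or x days ago), then the initialization part of the CTE must be replaced by the following snippet: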

SELECT MAX(attendanceDate) FROM attendance 

[Edit] Here is a SQL Fiddle with a query that solves your problem #1: SQL Fiddle

+0

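Could you give me the original fiddle that seemed to solve my problem #1 (without the today/yesterday consideration), so that I can first understand the basics of your query? – eugene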

+0

http://www.sqlfiddle.com/#!15/7016f/1 –

+0

If a user can attend more than once per day, see the edit. –

0
-- some data 
CREATE table dayworked (
     id SERIAL NOT NULL PRIMARY KEY 
     , user_id INTEGER NOT NULL 
     , arrived_at DATE NOT NULL 
     , UNIQUE (user_id, arrived_at) 
     ); 

INSERT INTO dayworked(user_id, arrived_at) VALUES 
(1, '2014-02-03') 
,(1, '2014-02-05') 
,(1, '2014-02-06') 
,(1, '2014-02-07') 
     -- 
,(2, '2014-02-03') 
,(2, '2014-02-05') 
,(2, '2014-02-06') 
,(2, '2014-02-07') 
,(2, '2014-02-08') 
     -- 
,(3, '2014-02-03') 
,(3, '2014-02-04') 
,(3, '2014-02-05') 
,(3, '2014-02-07') 
     -- 
,(5, '2014-02-08') 
     ; 

-- The query 
WITH RECURSIVE stretch AS (
     SELECT dw.user_id AS user_id 
       , dw.arrived_at AS first_day 
       , dw.arrived_at AS last_day 
       , 1::INTEGER AS nday 
     FROM dayworked dw 
     WHERE NOT EXISTS (-- Find start of chain: no previous day 
       SELECT * FROM dayworked nx 
       WHERE nx.user_id = dw.user_id 
       AND nx.arrived_at = dw.arrived_at - 1 
       ) 
     UNION ALL 
     SELECT dw.user_id AS user_id 
       , st.first_day AS first_day 
       , dw.arrived_at AS last_day 
       , 1+st.nday AS nday 
     FROM dayworked dw -- connect to chain: previous day := day before this day 
     JOIN stretch st ON st.user_id = dw.user_id AND st.last_day = dw.arrived_at -1 
     ) 
SELECT * FROM stretch st 
WHERE (st.nday > 1 OR st.first_day = NOW()::date) -- either more than one consecutive day, or starting today 
AND NOT EXISTS (-- Only the most recent stretch 
     SELECT * FROM stretch nx 
     WHERE nx.user_id = st.user_id 
     AND nx.first_day > st.first_day 
     ) 
AND NOT EXISTS (-- omit partial chains 
     SELECT * FROM stretch nx 
     WHERE nx.user_id = st.user_id 
     AND nx.first_day = st.first_day 
     AND nx.last_day > st.last_day 
     ) 
     ; 

Result:

CREATE TABLE 
INSERT 0 14 
user_id | first_day | last_day | nday 
---------+------------+------------+------ 
     1 | 2014-02-05 | 2014-02-07 | 3 
     2 | 2014-02-05 | 2014-02-08 | 4 
(2 rows) 
0

You can create an aggregate over a range type:

Create function sfunc (tstzrange, timestamptz) 
    returns tstzrange 
    language sql strict as $$ 
     select case when $2 - upper($1) <= '1 day'::interval 
       then tstzrange(lower($1), $2, '[]') 
       else tstzrange($2, $2, '[]') end 
    $$; 

Create aggregate consecutive (timestamptz) (
     sfunc = sfunc, 
     stype = tstzrange, 
     initcond = '[,]' 
); 

Use the aggregate with the proper ordering to get the range of consecutive days ending at the last arrived_at:

Select user_id, consecutive(arrived_at order by arrived_at) 
    from work 
    group by user_id; 

    ┌─────────┬─────────────────────────────────────────────────────┐ 
    │ user_id │                     consecutive                     │ 
    ├─────────┼─────────────────────────────────────────────────────┤ 
    │       1 │ ["2011-01-03 00:00:00+02","2011-01-05 00:00:00+02"] │ 
    │       2 │ ["2011-01-06 00:00:00+02","2011-01-06 00:00:00+02"] │ 
    └─────────┴─────────────────────────────────────────────────────┘ 
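
To see what the aggregate does step by step, its state function can be mirrored as an ordinary fold. This is only a Python sketch of the idea, under the assumption that a range is modelled as a `(lower, upper)` tuple and the empty initial state as `None`; the names `step` and `consecutive` are chosen here for illustration. Each timestamp either extends the current range (when it is at most one day past the upper bound) or starts a new one, so the final state is the last run.

```python
from datetime import datetime, timedelta
from functools import reduce


def step(state, ts):
    """Extend the range if ts is within one day of its upper bound, else restart at ts."""
    if state is not None and ts - state[1] <= timedelta(days=1):
        return (state[0], ts)
    return (ts, ts)


def consecutive(timestamps):
    """Fold the sorted timestamps; the result is the last consecutive range."""
    return reduce(step, sorted(timestamps), None)


arrivals = [datetime(2011, 1, 3), datetime(2011, 1, 4), datetime(2011, 1, 5)]
print(consecutive(arrivals))  # range from 2011-01-03 to 2011-01-05
```

Sorting inside the function plays the role of the `order by arrived_at` in the aggregate call; without a fixed order the fold would not be meaningful.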

Use the aggregate as a window function:

Select *, 
     consecutive(arrived_at) 
       over (partition by user_id order by arrived_at) 
    from work; 

    ┌────┬─────────┬────────────────────────┬─────────────────────────────────────────────────────┐ 
    │ id │ user_id │       arrived_at       │                     consecutive                     │ 
    ├────┼─────────┼────────────────────────┼─────────────────────────────────────────────────────┤ 
    │  1 │       1 │ 2011-01-03 00:00:00+02 │ ["2011-01-03 00:00:00+02","2011-01-03 00:00:00+02"] │ 
    │  2 │       1 │ 2011-01-04 00:00:00+02 │ ["2011-01-03 00:00:00+02","2011-01-04 00:00:00+02"] │ 
    │  3 │       1 │ 2011-01-05 00:00:00+02 │ ["2011-01-03 00:00:00+02","2011-01-05 00:00:00+02"] │ 
    │  4 │       2 │ 2011-01-06 00:00:00+02 │ ["2011-01-06 00:00:00+02","2011-01-06 00:00:00+02"] │ 
    └────┴─────────┴────────────────────────┴─────────────────────────────────────────────────────┘ 

Query the result to find what you need:

With work_detail as (select *, 
      consecutive(arrived_at) 
        over (partition by user_id order by arrived_at) 
     from work) 
    select arrived_at, upper(consecutive) - lower(consecutive) as days 
     from work_detail 
      where user_id = 1 and upper(consecutive) != lower(consecutive) 
      order by arrived_at desc 
       limit 1; 

    ┌────────────────────────┬────────┐ 
    │       arrived_at       │  days  │ 
    ├────────────────────────┼────────┤ 
    │ 2011-01-05 00:00:00+02 │ 2 days │ 
    └────────────────────────┴────────┘ 
0

You can even do this without a recursive CTE:
With generate_series(), LEFT JOIN, a window count() and a final LIMIT 1.

1 for "today" plus the number of consecutive days up until "yesterday":

SELECT count(*) -- 1/0 for "today" 
    + COALESCE((-- + optional count of consecutive days up until "yesterday" 
     SELECT ct 
     FROM (
      SELECT d.ct, count(w.arrived_at) OVER (ORDER BY d.ct) AS day_ct 
      FROM generate_series(1, 8) AS d(ct) -- maximum = 8 
      LEFT JOIN work w ON w.arrived_at >= current_date - d.ct 
          AND w.arrived_at < current_date - (d.ct - 1) 
          AND w.user_id = 1 -- given user 
     ) sub 
     WHERE ct = day_ct 
     ORDER BY ct DESC 
     LIMIT 1 
     ), 0) AS total 
FROM work 
WHERE arrived_at >= current_date -- no future timestamps 
AND user_id = 1     -- given user 

Assumes 0 or 1 entries per day. Should be fast.
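
The trick in this query is that the running count of attended days can only equal the offset `ct` while every day from yesterday back to that offset is present. A rough Python mirror of that logic follows, assuming at most one entry per day as stated above; the function name and the `max_days` parameter are illustrative and simply stand in for `generate_series(1, 8)`.

```python
from datetime import date, timedelta


def streak_via_running_count(worked_days, today, max_days=8):
    """Mirror of the query: 1/0 for today plus the unbroken run ending yesterday."""
    days = set(worked_days)
    total = 1 if today in days else 0      # outer count(*) for "today"
    running = 0                            # count(...) OVER (ORDER BY d.ct)
    best = 0
    for ct in range(1, max_days + 1):      # generate_series(1, max_days)
        if today - timedelta(days=ct) in days:
            running += 1
        if running == ct:                  # WHERE ct = day_ct
            best = ct                      # ORDER BY ct DESC LIMIT 1
    return total + best


today = date(2014, 3, 8)
worked = [date(2014, 3, n) for n in (3, 5, 6, 7)]
print(streak_via_running_count(worked, today))  # 3
```

Once a day is missing, the running count falls behind the offset and can never catch up, which is why the largest matching offset is the length of the run. Note that the run is capped at `max_days`, just as the query caps it at 8.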

For best performance (for this or the CTE solution alike), you would need a multicolumn index like:

CREATE INDEX foo_idx ON work (user_id,arrived_at); 
+0

Would this be faster than the CTE solution? – eugene

+0

@eugene: Probably yes. Consider the simplified update. Can you run 'EXPLAIN ANALYZE' on your data? –

+0

I don't have a large enough data set yet. And it took quite a while to adapt the answers to my actual schema. :( – eugene