2016-04-24 61 views

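Rails: calculating statistics and grouping by day

I am calculating statistics for my LaserSheet model to build morris.js charts for a dashboard page. I currently have this one statistic working: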

# Show four Mondays ago up to this coming Sunday (4 weeks) 
start_date = Time.zone.now.beginning_of_week - 3.weeks 
end_date = Time.zone.now.end_of_week 

# Calculate sheets cut per day 
empty_dates_hash = Hash[(start_date.to_date..end_date.to_date).collect { |v| [v, 0] }] 
recent_cut_stats = LaserSheet.where('cut_at IS NOT NULL') 
          .where('cut_at > ?', start_date.beginning_of_day) 
          .where('cut_at < ?', end_date.end_of_day) 
          .group("DATE(cut_at::TIMESTAMPTZ AT TIME ZONE '#{Time.zone.now.formatted_offset}'::INTERVAL)") 
          .count 
recent_cut_stats = empty_dates_hash.merge(recent_cut_stats) 

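I would like to add a historical "sheets left to cut" statistic grouped by day. To do that, for each date I need to find all LaserSheets whose created_at falls on or before that date and whose cut_at is NULL or later than that date.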

I can do this manually for yesterday:

LaserSheet.where('created_at < ?', Time.zone.yesterday.end_of_day) 
      .where('cut_at IS NULL OR cut_at > ?', Time.zone.yesterday.end_of_day) 
      .count 

And for today:

LaserSheet.where('created_at < ?', Time.zone.today.end_of_day) 
      .where('cut_at IS NULL OR cut_at > ?', Time.zone.today.end_of_day) 
      .count 

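I could repeat this for every day in [start_date..end_date], but that is very inefficient. Is there a way to do this with a single database query? It does not look as simple as a plain grouping by day.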

I am using PostgreSQL and Rails 4.

Answer

I won't be able to write remotely efficient, or even syntactically valid, Ruby code, but here is some raw SQL that may help you.

You should use generate_series to generate a list of dates, then LEFT JOIN against that list:

SELECT * 
FROM generate_series(
    date_trunc('day', now()) - CAST('7 day' AS interval), 
    date_trunc('day', now()), 
    CAST('1 day' AS interval) 
); 

    generate_series 
------------------------ 
2016-04-17 00:00:00+00 
2016-04-18 00:00:00+00 
2016-04-19 00:00:00+00 
2016-04-20 00:00:00+00 
2016-04-21 00:00:00+00 
2016-04-22 00:00:00+00 
2016-04-23 00:00:00+00 
2016-04-24 00:00:00+00 
(8 rows) 

Now that you know how to generate a series of dates, all you have to do is join against it with the right clause.

But first, we need some test data:

SELECT 
    CAST(created_at AS timestamp), 
    CAST(cut_at AS timestamp) 
FROM (
    VALUES 
     ('2016-04-20', null),   /* not cut yet */ 
     ('2016-04-20', '2016-04-22'), /* cut 2 days ago */ 
     ('2016-04-20', null),   /* not cut yet */ 
     ('2016-04-23', '2016-04-23'), /* cut yesterday */ 
     ('2016-04-23', null),   /* not cut yet */ 
     ('2016-04-24', '2016-04-24'), /* cut today */ 
     ('2016-04-24', '2016-04-26') /* cut tomorrow (because I can :p) */ 
) as laser_sheet(created_at, cut_at); 


    created_at  |  cut_at 
---------------------+--------------------- 
2016-04-20 00:00:00 | 
2016-04-20 00:00:00 | 2016-04-22 00:00:00 
2016-04-20 00:00:00 | 
2016-04-23 00:00:00 | 2016-04-23 00:00:00 
2016-04-23 00:00:00 | 
2016-04-24 00:00:00 | 2016-04-24 00:00:00 
2016-04-24 00:00:00 | 2016-04-26 00:00:00 
(7 rows) 
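
Before wiring this into a join, the matching rule itself (created before the day, and either not cut yet or cut after it) can be prototyped in plain Ruby against the same sample rows. This is only a rough sketch for checking the logic; `SHEETS`, `sheets_left_on` and `left_to_cut_by_day` are names made up for illustration, and plain dates stand in for the midnight timestamps.

```ruby
require 'date'

# Sample rows mirroring the test data: [created_at, cut_at]; nil means "not cut yet".
SHEETS = [
  [Date.new(2016, 4, 20), nil],
  [Date.new(2016, 4, 20), Date.new(2016, 4, 22)],
  [Date.new(2016, 4, 20), nil],
  [Date.new(2016, 4, 23), Date.new(2016, 4, 23)],
  [Date.new(2016, 4, 23), nil],
  [Date.new(2016, 4, 24), Date.new(2016, 4, 24)],
  [Date.new(2016, 4, 24), Date.new(2016, 4, 26)]
].freeze

# Counts sheets created strictly before `day` that are uncut or cut strictly after it.
def sheets_left_on(day, sheets = SHEETS)
  sheets.count do |created_at, cut_at|
    created_at < day && (cut_at.nil? || cut_at > day)
  end
end

# One entry per day in the inclusive range, including days with no matching sheets.
def left_to_cut_by_day(first_day, last_day, sheets = SHEETS)
  (first_day..last_day).map { |day| [day, sheets_left_on(day, sheets)] }.to_h
end

left_to_cut_by_day(Date.new(2016, 4, 17), Date.new(2016, 4, 24)).values
# => [0, 0, 0, 0, 3, 2, 2, 3]
```

This walks every sheet for every day, so it is only useful as a reference for what the SQL join should return, not as a replacement for it.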

And the final query should look like this:

WITH date_serie AS (
    /* generate one row per day: the previous 7 days plus today (8 rows) */ 
    SELECT generate_series as day 
    FROM generate_series(
     /* replace "CAST('2016-04-24 16:56:23' AS timestamp)" with "now()" to get a dynamic view */ 
     date_trunc('day', CAST('2016-04-24 16:56:23' AS timestamp)) - CAST('7 day' AS interval), 
     date_trunc('day', CAST('2016-04-24 16:56:23' AS timestamp)), 
     CAST('1 day' AS interval) 
    ) 
), 
laser_sheet AS (
    /* below is some test data */ 
    SELECT 
     CAST(created_at AS timestamp) AS created_at, 
     CAST(cut_at AS timestamp) AS cut_at 
    FROM (
     VALUES 
      ('2016-04-20', null),   /* not cut yet */ 
      ('2016-04-20', '2016-04-22'), /* cut 2 days ago */ 
      ('2016-04-20', null),   /* not cut yet */ 
      ('2016-04-23', '2016-04-23'), /* cut yesterday */ 
      ('2016-04-23', null),   /* not cut yet */ 
      ('2016-04-24', '2016-04-24'), /* cut today */ 
      ('2016-04-24', '2016-04-26') /* cut tomorrow (because I can :p) */ 
    ) as laser_sheet(created_at, cut_at) 
) 
SELECT 
    date_serie.day, 
    /* count the laser_sheet rows that match this day (0 when none match) */ 
    count(laser_sheet.*) as sheets_left_to_cut 
FROM 
    date_serie 
    LEFT JOIN laser_sheet 
    /* notice here your custom join clause */ 
    ON laser_sheet.created_at < date_serie.day 
    AND (
     laser_sheet.cut_at IS NULL 
     OR laser_sheet.cut_at > date_serie.day 
    ) 
GROUP BY 
    date_serie.day 
ORDER BY 
    date_serie.day 
; 

Here is the result:

  day   | sheets_left_to_cut 
---------------------+-------------------- 
2016-04-17 00:00:00 |     0 
2016-04-18 00:00:00 |     0 
2016-04-19 00:00:00 |     0 
2016-04-20 00:00:00 |     0 
2016-04-21 00:00:00 |     3 
2016-04-22 00:00:00 |     2 
2016-04-23 00:00:00 |     2 
2016-04-24 00:00:00 |     3 
(8 rows)
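
For the Rails side, one possible way to run a query of this shape is sketched below. Treat it as a guess rather than tested Rails code: the table name `laser_sheets` is assumed from Rails naming conventions, `sheets_left_sql` and `fetch_sheets_left_by_day` are hypothetical helpers, and `connection` is anything that responds to `select_rows(sql)`, which `ActiveRecord::Base.connection` does.

```ruby
require 'date'

# Assumed table name, following the Rails convention for a LaserSheet model.
TABLE = 'laser_sheets'

# Builds the per-day "left to cut" query for an inclusive date range.
# The bounds are rendered with Date#iso8601, so no free-form text reaches the SQL.
def sheets_left_sql(first_day, last_day)
  <<~SQL
    SELECT days.day::date AS day, count(s.*) AS sheets_left_to_cut
    FROM generate_series('#{first_day.iso8601}'::timestamp,
                         '#{last_day.iso8601}'::timestamp,
                         '1 day'::interval) AS days(day)
    LEFT JOIN #{TABLE} s
      ON s.created_at < days.day
     AND (s.cut_at IS NULL OR s.cut_at > days.day)
    GROUP BY days.day
    ORDER BY days.day
  SQL
end

# Runs the query and returns { Date => Integer }, one entry per day in the range.
# Values are normalised because select_rows may hand back strings or typed objects.
def fetch_sheets_left_by_day(connection, first_day, last_day)
  connection.select_rows(sheets_left_sql(first_day, last_day)).map do |day, count|
    [Date.parse(day.to_s), count.to_i]
  end.to_h
end
```

With the question's variables this would be called as `fetch_sheets_left_by_day(ActiveRecord::Base.connection, start_date.to_date, end_date.to_date)`. Because generate_series already yields a row for every day, the result should not need merging with an empty dates hash. Note that the day boundaries here are midnight in whatever zone the timestamps are stored in, so the time zone offset handling from the question would still have to be added.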