2012-12-05 65 views
2

How do I fill timestamp gaps in a Postgres query?

Assume a table like the following:

CREATE TABLE channel1m (
    ts TIMESTAMP WITHOUT TIME ZONE NOT NULL, 
    itemId BIGINT, 
    value BIGINT 
) 

in which one row may be inserted per minute for each itemId, as follows:

ts                   itemId  value
2012-12-03 15:29:00  100     1
2012-12-03 15:30:00  100     2
2012-12-03 15:30:00  101     0
2012-12-03 15:32:00  100     1
2012-12-03 15:32:00  101     1

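I cannot find a way (without creating additional tables) to write a query that fills the time gaps by returning NULL for value (for example, 15:29:00 for itemId 101, and 15:31:00 for both items).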

The expected result set is:

ts                   itemId  value
2012-12-03 15:29:00  100     1
2012-12-03 15:29:00  101     NULL
2012-12-03 15:30:00  100     2
2012-12-03 15:30:00  101     0
2012-12-03 15:31:00  100     NULL
2012-12-03 15:31:00  101     NULL
2012-12-03 15:32:00  100     1
2012-12-03 15:32:00  101     1

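I have found solutions that rely on a separate table holding the full series of timestamps, but I would prefer to solve this within the query alone. Is that possible?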

+3

Left join against a calendar table, which can be generated with generate_series(min(ts), max(ts), '1 minute') – wildplasser

+0

@wildplasser: You should post that as an answer. –

+0

I'm working on it ... including (nested!) CTEs – wildplasser

Answers

6
DROP SCHEMA IF EXISTS tmp CASCADE; -- IF EXISTS avoids an error on the first run
CREATE SCHEMA tmp ; 
SET search_path = tmp; 

DROP TABLE IF EXISTS channel1m CASCADE; 
CREATE TABLE channel1m (
    zts TIMESTAMP WITHOUT TIME ZONE NOT NULL, 
    zitemid BIGINT, 
    zvalue BIGINT 
); 

-- in which a row may be inserted each minute, per zitemid, as follows: 

INSERT INTO channel1m(zts, zitemid, zvalue) VALUES 
('2012-12-03 15:29:00', 100,   1) 
,('2012-12-03 15:30:00', 100,   2) 
,('2012-12-03 15:30:00', 101,   0) 
,('2012-12-03 15:32:00', 100,   1) 
,('2012-12-03 15:32:00', 101,   1) 
     ; 

     -- CTE to the rescue!!! 
WITH cal AS (
     WITH mm AS (
       SELECT MIN(xx.zts) AS minmin, MAX(xx.zts) AS maxmax 
       FROM channel1m xx) 
     SELECT generate_series(mm.minmin , mm.maxmax , '1 min'::interval) AS stamp 
     FROM mm 
     ) 
, ite AS (
     SELECT DISTINCT zitemid AS zitemid 
     FROM channel1m 
     ) 
SELECT cal.stamp 
     , ite.zitemid 
     , tab.zvalue 
FROM cal 
CROSS JOIN ite -- Note: this is a cartesian product of the {time,id} -domains 
LEFT JOIN channel1m tab ON tab.zts = cal.stamp AND tab.zitemid = ite.zitemid 
ORDER BY stamp ASC 
     ; 

Output:

NOTICE: drop cascades to table tmp.channel1m 
DROP SCHEMA 
CREATE SCHEMA 
SET 
NOTICE: table "channel1m" does not exist, skipping 
DROP TABLE 
CREATE TABLE 
INSERT 0 5 
        stamp        | zitemid | zvalue
---------------------+---------+--------
 2012-12-03 15:29:00 |     101 |
 2012-12-03 15:29:00 |     100 |      1
 2012-12-03 15:30:00 |     100 |      2
 2012-12-03 15:30:00 |     101 |      0
 2012-12-03 15:31:00 |     100 |
 2012-12-03 15:31:00 |     101 |
 2012-12-03 15:32:00 |     100 |      1
 2012-12-03 15:32:00 |     101 |      1
(8 rows)
+0

Very nice, thank you! – luisfarzati

+0

I had missed the Cartesian product when trying to solve a similar problem. Thanks! – jriggins

4

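You will need two things: a table of all itemIds and a (pseudo-)table of all the required dates.

You probably already have a table holding all the distinct itemIds. Let's call it item_table.

The pseudo-table of dates can be produced with generate_series('start_date', 'end_date', interval '1 minute'). See the PostgreSQL documentation on set-returning functions for details.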

Query:

SELECT gs.ts, it.itemId, ch1m.value 
FROM item_table it 
CROSS JOIN generate_series('start_date','end_date', interval '1 minute') gs(ts) 
LEFT JOIN channel1m ch1m ON it.itemId = ch1m.itemId 
         AND gs.ts = ch1m.ts 

Replace 'start_date' and 'end_date' with the values you need, or fetch them in a subquery.

This query:

1) builds all item/time pairs via CROSS JOIN

2) fetches value for each pair via LEFT JOIN
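As a rough sketch of the subquery variant mentioned above: the bounds can be passed to generate_series as scalar subqueries. This version assumes there is no item_table, so the item list is derived from channel1m with DISTINCT (an assumption, not part of the original query), and it adds an ORDER BY so the rows come back in a stable order.

```sql
-- Sketch only: derive both the time range and the item list from channel1m.
-- If channel1m is empty, min/max are NULL and generate_series yields no rows.
SELECT gs.ts, it.itemId, ch1m.value
FROM generate_series(
         (SELECT min(ts) FROM channel1m),   -- start of the range
         (SELECT max(ts) FROM channel1m),   -- end of the range
         interval '1 minute'
     ) AS gs(ts)
-- Assumption: no item_table exists, so take the distinct ids from the data.
CROSS JOIN (SELECT DISTINCT itemId FROM channel1m) AS it
LEFT JOIN channel1m AS ch1m
       ON ch1m.itemId = it.itemId
      AND ch1m.ts = gs.ts
ORDER BY gs.ts, it.itemId;
```

With the sample rows from the question, this should return the eight rows of the expected result set, with NULL in value for the three missing combinations.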

1

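I think the most readable approach is to build a series of common table expressions. A cross join between the minutes and the item id numbers gives you every combination.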

with all_minutes as (
    select ('2012-12-03 15:29'::timestamp + 
      (n || ' minute')::interval)::timestamp as ts 
    from generate_series(0,10) n 
), 
item_ids as (
    select distinct itemid from channel1m 
), 
all_items_and_minutes as (
    select all_minutes.ts, item_ids.itemid from all_minutes cross join item_ids 
) 
select all_items_and_minutes.ts, all_items_and_minutes.itemId, channel1m.value 
from all_items_and_minutes 
left join channel1m 
     on all_items_and_minutes.ts = channel1m.ts 
     and all_items_and_minutes.itemid = channel1m.itemid 
order by all_items_and_minutes.ts, all_items_and_minutes.itemid 

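You can replace the timestamp literal with a SELECT statement to get the actual range you need. If you have a separate table containing all the unique item id numbers, it is probably better to select from that table instead of selecting distinct values from channel1m.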
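One possible way to do that replacement is sketched below. It keeps the same CTE chain but reads the range from channel1m through scalar subqueries. The alias g is illustrative, and the item list still comes from channel1m because no separate item table is defined in the question.

```sql
-- Sketch only: the CTE chain with the hard-coded range replaced by bounds
-- read from channel1m. An empty channel1m simply produces no rows.
with all_minutes as (
    select g.ts
    from generate_series(
             (select min(ts) from channel1m),
             (select max(ts) from channel1m),
             interval '1 minute'
         ) as g(ts)
),
item_ids as (
    -- If a dedicated item table exists, select from it here instead.
    select distinct itemid from channel1m
),
all_items_and_minutes as (
    select all_minutes.ts, item_ids.itemid
    from all_minutes cross join item_ids
)
select all_items_and_minutes.ts, all_items_and_minutes.itemid, channel1m.value
from all_items_and_minutes
left join channel1m
       on all_items_and_minutes.ts = channel1m.ts
      and all_items_and_minutes.itemid = channel1m.itemid
order by all_items_and_minutes.ts, all_items_and_minutes.itemid;
```

Because the bounds follow the data, the result covers exactly the minutes between the first and last recorded timestamp, rather than a fixed eleven-minute window.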