2017-01-13 48 views
0

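Join the same table twice without duplicated rows

Problem: after joining the same table twice, I get a lot of duplicated rows and a very long-running query.

There are two tables:

events: event_id, event_time, event_name
tseries: data_time, data

event_time is the time of a specific event; tseries has one row of data for every minute (data_time).

I want to output a table with 6 columns: event_id, event_time, data_time1, data1, data_time2, data2, where data_time1/data1 cover the first 2 minutes after the event (minutes 0 and 1) and data_time2/data2 cover the next 2 minutes (minutes 2 and 3).

My query: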

SELECT 
* 
FROM events 
LEFT JOIN tseries ts1 
ON ts1.data_time >= (events.event_time) AND ts1.data_time <= (events.event_time + time '00:01:00') 
LEFT JOIN tseries ts2 
ON ts2.data_time >= (events.event_time + time '00:02:00') AND ts2.data_time <= (events.event_time + time '00:03:00') 
ORDER BY events.event_id 
; 

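This query produces the table below (I include only the time columns), and it gets even worse when the same table is joined more times.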

event_time data_time1 data_time2 
    x  x+0  x+2 
    x  x+0  x+3 
    x  x+1  x+2 
    x  x+1  x+3 

I would prefer something like one of the following instead:

event_time data_time1 data_time2 
    x  x+0  x+2 
    x  x+1  x+3 

event_time data_time1 data_time2 
    x  x+0  null 
    x  x+1  null 
    x  null  x+2 
    x  null  x+3 

Any ideas? Thanks for your help/answers :)

+2

Why are you joining twice? Couldn't you just join once and put the rest in the WHERE clause? – SaggingRufus

+3

Can you add sample data for the tables?

+0

"Twice + WHERE": I don't understand your idea, but I'm a beginner. It didn't work for me. Thanks for the tip.

Answers

0

One method is conditional aggregation, assuming you want only one row per event:

-- tseries is joined once (alias ts); each CASE picks out one time window.
-- GROUP BY e.event_id with e.* relies on event_id being the primary key of events.
SELECT e.*, 
     MAX(CASE WHEN ts.data_time >= e.event_time AND ts.data_time <= e.event_time + time '00:01:00' THEN ts.data END) as data_1, 
     MAX(CASE WHEN ts.data_time >= e.event_time + time '00:02:00' AND ts.data_time <= e.event_time + time '00:03:00' THEN ts.data END) as data_2 
FROM events e LEFT JOIN 
    tseries ts 
    ON (ts.data_time >= e.event_time AND ts.data_time <= e.event_time + time '00:01:00') OR 
     (ts.data_time >= e.event_time + time '00:02:00' AND ts.data_time <= e.event_time + time '00:03:00') 
GROUP BY e.event_id 
ORDER BY e.event_id; 

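However, this does not work if you need to keep multiple matches within each time period, because MAX() collapses them into a single value.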
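
To see what the conditional aggregation returns, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up, and times are stored as integer minutes (an assumption made only so the example runs without PostgreSQL's time arithmetic):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events  (event_id INTEGER PRIMARY KEY, event_time INTEGER);
    CREATE TABLE tseries (data_time INTEGER, data REAL);
    -- hypothetical sample: one event at minute 10, one reading per minute
    INSERT INTO events  VALUES (1, 10);
    INSERT INTO tseries VALUES (10, 1.0), (11, 5.0), (12, 2.0), (13, 7.0), (14, 9.0);
""")

# One join to tseries; each CASE keeps only the readings of one window,
# and MAX() reduces each window to a single value per event.
rows = conn.execute("""
    SELECT e.event_id,
           MAX(CASE WHEN ts.data_time BETWEEN e.event_time     AND e.event_time + 1
                    THEN ts.data END) AS data_1,
           MAX(CASE WHEN ts.data_time BETWEEN e.event_time + 2 AND e.event_time + 3
                    THEN ts.data END) AS data_2
    FROM events e
    LEFT JOIN tseries ts
           ON (ts.data_time BETWEEN e.event_time     AND e.event_time + 1)
           OR (ts.data_time BETWEEN e.event_time + 2 AND e.event_time + 3)
    GROUP BY e.event_id
""").fetchall()

print(rows)  # [(1, 5.0, 7.0)]
```

With this sample the readings 1.0 and 2.0 do not appear in the result: each window is reduced to its maximum, so there is exactly one row per event.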

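For multiple rows, one method is to enumerate the values for each event and each time period. You can then use that sequence number for matching. The following uses a FULL JOIN in case the two lists have different lengths: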

SELECT COALESCE(ts1.event_id, ts2.event_id) as event_id, 
     ts1.data as data1, ts2.data as data2 
FROM (SELECT e.event_id, ts1.data, 
      -- number the readings of the first window, oldest first
      ROW_NUMBER() OVER (PARTITION BY e.event_id ORDER BY ts1.data_time) as seqnum 
     FROM events e JOIN 
      tseries ts1 
      ON ts1.data_time >= e.event_time AND 
       ts1.data_time <= e.event_time + time '00:01:00' 
    ) ts1 FULL JOIN 
    (SELECT e.event_id, ts2.data, 
      -- number the readings of the second window the same way
      ROW_NUMBER() OVER (PARTITION BY e.event_id ORDER BY ts2.data_time) as seqnum 
     FROM events e JOIN 
      tseries ts2 
      ON ts2.data_time >= e.event_time + time '00:02:00' AND 
       ts2.data_time <= e.event_time + time '00:03:00' 
    ) ts2 
    ON ts1.event_id = ts2.event_id AND ts1.seqnum = ts2.seqnum 
ORDER BY event_id; 

Note: if you want other fields from events, then you can use:

SELECT e.*, 
     ts1.data as data1, ts2.data as data2 
FROM events e LEFT JOIN 
    (SELECT e.event_id, ts1.data, 
      ROW_NUMBER() OVER (PARTITION BY e.event_id ORDER BY ts1.data_time) as seqnum 
     FROM events e JOIN 
      tseries ts1 
      ON ts1.data_time >= e.event_time AND 
       ts1.data_time <= e.event_time + time '00:01:00' 
    ) ts1 
    ON ts1.event_id = e.event_id LEFT JOIN 
    (SELECT e.event_id, ts2.data, 
      ROW_NUMBER() OVER (PARTITION BY e.event_id ORDER BY ts2.data_time) as seqnum 
     FROM events e JOIN 
      tseries ts2 
      ON ts2.data_time >= e.event_time + time '00:02:00' AND 
       ts2.data_time <= e.event_time + time '00:03:00' 
    ) ts2 
    ON e.event_id = ts2.event_id AND ts1.seqnum = ts2.seqnum 
ORDER BY e.event_id; 
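
The sequence-number idea can be tried out with a small sketch like the one below. It uses Python's built-in sqlite3 module, made-up sample rows, and integer minutes instead of timestamps (all assumptions for illustration). It needs SQLite 3.25 or later for ROW_NUMBER(), and it uses a LEFT JOIN rather than a FULL JOIN because older SQLite versions have no FULL JOIN:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events  (event_id INTEGER PRIMARY KEY, event_time INTEGER);
    CREATE TABLE tseries (data_time INTEGER, data REAL);
    -- hypothetical sample: one event at minute 10, one reading per minute
    INSERT INTO events  VALUES (1, 10);
    INSERT INTO tseries VALUES (10, 1.0), (11, 5.0), (12, 2.0), (13, 7.0), (14, 9.0);
""")

# Number the readings inside each window, then pair them up by that number:
# 1st reading of window 1 with 1st reading of window 2, 2nd with 2nd, and so on.
rows = conn.execute("""
    WITH w1 AS (
        SELECT e.event_id, ts.data_time, ts.data,
               ROW_NUMBER() OVER (PARTITION BY e.event_id ORDER BY ts.data_time) AS seqnum
        FROM events e
        JOIN tseries ts ON ts.data_time BETWEEN e.event_time AND e.event_time + 1
    ),
    w2 AS (
        SELECT e.event_id, ts.data_time, ts.data,
               ROW_NUMBER() OVER (PARTITION BY e.event_id ORDER BY ts.data_time) AS seqnum
        FROM events e
        JOIN tseries ts ON ts.data_time BETWEEN e.event_time + 2 AND e.event_time + 3
    )
    SELECT w1.event_id, w1.data_time, w1.data, w2.data_time, w2.data
    FROM w1
    LEFT JOIN w2 ON w2.event_id = w1.event_id AND w2.seqnum = w1.seqnum
    ORDER BY w1.event_id, w1.seqnum
""").fetchall()

for row in rows:
    print(row)
# (1, 10, 1.0, 12, 2.0)
# (1, 11, 5.0, 13, 7.0)
```

This gives two rows per event (x+0 paired with x+2, x+1 paired with x+3) instead of the four combinations produced by the plain double join.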
+0

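The first one fits my needs with small modifications, but I will try the other two. Super :) Thank you! (I had not come across COALESCE and PARTITION BY before.)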

0

Try:

SELECT distinct 
* 
FROM events 
LEFT JOIN tseries ts1 
ON ts1.data_time >= (events.event_time) AND ts1.data_time <= (events.event_time + time '00:01:00') 
LEFT JOIN tseries ts2 
ON ts2.data_time >= (events.event_time + time '00:02:00') AND ts2.data_time <= (events.event_time + time '00:03:00') 
ORDER BY events.event_id; 
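
DISTINCT only removes rows that are identical in every selected column, and the four rows shown in the question all differ in their data_time values. A quick check, sketched with Python's built-in sqlite3 module on made-up data with integer minutes (both assumptions for illustration), suggests the row count stays the same:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events  (event_id INTEGER PRIMARY KEY, event_time INTEGER);
    CREATE TABLE tseries (data_time INTEGER, data REAL);
    -- hypothetical sample: one event at minute 10, one reading per minute
    INSERT INTO events  VALUES (1, 10);
    INSERT INTO tseries VALUES (10, 1.0), (11, 5.0), (12, 2.0), (13, 7.0), (14, 9.0);
""")

# The double join from the question, with an optional DISTINCT keyword.
query = """
    SELECT {distinct} e.event_id, ts1.data_time, ts2.data_time
    FROM events e
    LEFT JOIN tseries ts1 ON ts1.data_time BETWEEN e.event_time     AND e.event_time + 1
    LEFT JOIN tseries ts2 ON ts2.data_time BETWEEN e.event_time + 2 AND e.event_time + 3
    ORDER BY 2, 3
"""
plain = conn.execute(query.format(distinct="")).fetchall()
deduped = conn.execute(query.format(distinct="DISTINCT")).fetchall()

print(len(plain), len(deduped))  # 4 4
```

Every combination of a first-window row with a second-window row is a different row, so there is nothing for DISTINCT to remove; it only adds a de-duplication step on top of the same join.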
+0

That makes the query run very long, or forever? I don't know, I didn't wait for it to finish. But thanks for the tip.