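I have two sets of data from an external source: customers' purchase dates, and customers' last email click/open dates. They are stored in two tables, PURCHASE_INTER and ACTIVITY_INTER, respectively. There can be multiple purchase rows per customer, and I need to extract the last purchase date; the activity data, however, is unique per customer. The two data sets are independent of each other, and a customer may be missing from either one. We wrote the query below. It combines the two tables, groups by the customer's id from the external source (PERSON_ID), and takes the latest dates. It then joins our CUSTOMER table to get the customer's email, and joins once more to the table where this data will finally be stored, to find out whether each row will be an insert or an update. Can you suggest how to improve the performance of this query? It is very slow and takes more than 10 hours. Millions of records come into the PURCHASE_INTER and ACTIVITY_INTER tables.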
SELECT INTER.*,
       C.ID AS CUSTOMER_ID,
       C.EMAIL AS CUSTOMER_EMAIL,
       LSI.ID AS INTERACTION_ID,
       ROW_NUMBER() OVER (ORDER BY PERSON_ID ASC) AS RN
FROM (
    SELECT PERSON_ID AS PERSON_ID,
           MAX(LAST_CLICK_DATE) AS LAST_CLICK_DATE,
           MAX(LAST_OPEN_DATE) AS LAST_OPEN_DATE,
           MAX(LAST_PURCHASE_DATE) AS LAST_PURCHASE_DATE
    FROM (
        SELECT ACT.PERSON_ID AS PERSON_ID,
               ACT.LAST_CLICK_DATE AS LAST_CLICK_DATE,
               ACT.LAST_OPEN_DATE AS LAST_OPEN_DATE,
               NULL AS LAST_PURCHASE_DATE
        FROM ACTIVITY_INTER ACT
        WHERE ACT.JOB_ID = 77318317
        UNION
        SELECT PUR.PERSON_ID AS PERSON_ID,
               NULL AS LAST_CLICK_DATE,
               NULL AS LAST_OPEN_DATE,
               PUR.LAST_PURCHASE_DATE AS LAST_PURCHASE_DATE
        FROM PURCHASE_INTER PUR
        WHERE PUR.JOB_ID = 77318317
    ) GROUP BY PERSON_ID
) INTER
LEFT JOIN CUSTOMER C ON INTER.PERSON_ID = C.PERSON_ID
LEFT JOIN INTERACTION LSI ON C.ID = LSI.CUSTOMER_ID;
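Do you need to remove duplicates, or can you use 'UNION ALL' instead of 'UNION'? – jarlh
How many records match the given job? –
Do you really need the 'RN' column? If you are returning a lot of rows, it could be expensive to compute. –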
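Following up on the comments, here is a rough sketch of how the query might look with 'UNION ALL' and without 'RN'. It has not been run against the real schema, so treat it as a starting point and compare execution plans before adopting it. The reasoning: the outer MAX(...) with GROUP BY PERSON_ID already collapses duplicate rows, so the duplicate elimination that 'UNION' performs should not change the result. The purchase rows are also aggregated per person before being combined, which may shrink the set that has to be grouped. Dropping 'RN' assumes nothing downstream depends on it. The alias COMBINED and both index names are hypothetical, and whether the indexes help at all depends on how many rows match a single JOB_ID.

```sql
-- Sketch only: same tables, columns, and JOB_ID value as in the question.
-- Assumption: the RN column is not required by whatever consumes the result.
SELECT INTER.*,
       C.ID    AS CUSTOMER_ID,
       C.EMAIL AS CUSTOMER_EMAIL,
       LSI.ID  AS INTERACTION_ID
FROM (
    SELECT PERSON_ID,
           MAX(LAST_CLICK_DATE)    AS LAST_CLICK_DATE,
           MAX(LAST_OPEN_DATE)     AS LAST_OPEN_DATE,
           MAX(LAST_PURCHASE_DATE) AS LAST_PURCHASE_DATE
    FROM (
        -- Activity rows: stated in the question to be unique per customer.
        SELECT ACT.PERSON_ID       AS PERSON_ID,
               ACT.LAST_CLICK_DATE AS LAST_CLICK_DATE,
               ACT.LAST_OPEN_DATE  AS LAST_OPEN_DATE,
               NULL                AS LAST_PURCHASE_DATE
        FROM ACTIVITY_INTER ACT
        WHERE ACT.JOB_ID = 77318317
        -- UNION ALL skips duplicate elimination; the outer GROUP BY
        -- with MAX gives the same result either way.
        UNION ALL
        -- Purchase rows: reduced to one row per person before combining.
        SELECT PUR.PERSON_ID               AS PERSON_ID,
               NULL                        AS LAST_CLICK_DATE,
               NULL                        AS LAST_OPEN_DATE,
               MAX(PUR.LAST_PURCHASE_DATE) AS LAST_PURCHASE_DATE
        FROM PURCHASE_INTER PUR
        WHERE PUR.JOB_ID = 77318317
        GROUP BY PUR.PERSON_ID
    ) COMBINED  -- hypothetical alias
    GROUP BY PERSON_ID
) INTER
LEFT JOIN CUSTOMER C ON INTER.PERSON_ID = C.PERSON_ID
LEFT JOIN INTERACTION LSI ON C.ID = LSI.CUSTOMER_ID;

-- Hypothetical supporting indexes (names invented for illustration).
-- They are only likely to pay off if one JOB_ID selects a small share of each table.
CREATE INDEX IX_ACTIVITY_INTER_JOB ON ACTIVITY_INTER (JOB_ID, PERSON_ID);
CREATE INDEX IX_PURCHASE_INTER_JOB ON PURCHASE_INTER (JOB_ID, PERSON_ID, LAST_PURCHASE_DATE);
```

If most of the 10 hours turns out to be spent in the joins to CUSTOMER and INTERACTION rather than in the inner aggregation, the execution plan should show it, and the join columns (CUSTOMER.PERSON_ID and INTERACTION.CUSTOMER_ID) would be the next place to check for indexes.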