2016-12-12

How to optimize a JOIN in PostgreSQL

I am pulling information from four tables:

user: first_name
mongouser: email, card_status
transaction: transaction_type, balance, posted_at, is_atm, is_purchase
user_login: user_id, login_date, login_id ...

Before I joined the fourth table, user_login, everything worked fine. The fourth JOIN, however, makes everything slow. The query I wrote is shown below:

SELECT * FROM 
(SELECT 
ssluserid, 
first_name, 
m.email, 
zipcode, 
date_part('year',age(birthday)) AS birthday, 
(current_date - DATE(created_date)) AS duration, 
CASE WHEN card_status = 'ACTIVE' THEN 1 ELSE 0 END AS IS_ACTIVE, 
SUM(CASE WHEN transaction_type = 'Credit' AND balance > 1.00 THEN balance END) AS LOAD_AMT, 
SUM(CASE WHEN transaction_type = 'Debit' AND balance > 1.00 THEN balance END) AS SPEND_AMT, 
COUNT(CASE WHEN transaction_type = 'Credit' AND balance > 1.00 THEN balance END) AS LOAD_CT, 
COUNT(CASE WHEN transaction_type = 'Debit' AND balance > 1.00 THEN balance END) AS SPEND_CT, 
MIN(CASE WHEN transaction_type = 'Credit' AND balance > 1.00 THEN DATE(posted_at) END) AS FIRST_LOAD, 
MAX(CASE WHEN transaction_type = 'Credit' AND balance > 1.00 THEN DATE(posted_at) END) AS LAST_LOAD, 
MIN(CASE WHEN transaction_type = 'Debit' AND balance > 1.00 THEN DATE(posted_at) END) AS FIRST_SPEND, 
MAX(CASE WHEN transaction_type = 'Debit' AND balance > 1.00 THEN DATE(posted_at) END) AS LAST_SPEND, 
    SUM(CASE WHEN transaction_type = 'Debit' AND is_atm = 't' AND DATE(posted_at) >= CURRENT_DATE - INTERVAL '90 days' 
            THEN balance END) AS ATM_AMT, 
    SUM(CASE WHEN transaction_type = 'Debit' AND is_purchase = 't' AND DATE(posted_at) >= CURRENT_DATE - INTERVAL '90 days' 
            THEN balance END) AS POS_AMT, 
    SUM(CASE WHEN transaction_type = 'Credit' AND balance > 1.00 AND DATE(posted_at) >= CURRENT_DATE - INTERVAL '90 days' 
            THEN balance END) AS LOAD_VOL, 
    COUNT(CASE WHEN DATE(login_date) >= CURRENT_DATE - INTERVAL '90 days' THEN 
login_id END) AS CT_LOGIN 
FROM 
mongouser m 
LEFT OUTER JOIN 
user u 
ON m.userid = u.id 
LEFT OUTER JOIN transactions t 
ON u.id = t.user_id 
LEFT OUTER JOIN user_login l 
ON m.userid = l.user_id 
GROUP BY 1,2,3,4,5,6,7) t 
WHERE LAST_LOAD >= CURRENT_DATE - INTERVAL '90 days' 
ORDER BY 9 DESC; 

The query has been running for almost 40 minutes... Is there any way to optimize it?

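Yes, there are many ways to optimize it. You can use EXPLAIN to get a report showing where the big costs in the query are and whether indexes could be put to better use anywhere. You can also change the table indexes, limit the number of columns you fetch or the number of rows you pull, and so on. Or you could try dropping the Debit/Credit flag and storing debits as negative values, after which you could remove all the CASE expressions. Have you tried to find any optimizations yourself, or at least investigated where your query is getting held up? – GordonM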
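
As a rough illustration of the EXPLAIN suggestion, the sketch below inspects only the newly added join in isolation. The reduced SELECT list is an assumption made for brevity, not the original query. Note that EXPLAIN with ANALYZE actually executes the statement, so for a query that runs for 40 minutes it may be wiser to start with plain EXPLAIN, which only shows the planner's estimates.

```sql
-- Plain EXPLAIN: shows the estimated plan without running the query.
EXPLAIN
SELECT m.userid, COUNT(l.login_id) AS ct_login
FROM mongouser m
LEFT OUTER JOIN user_login l ON m.userid = l.user_id
GROUP BY m.userid;

-- EXPLAIN (ANALYZE, BUFFERS): runs the query and reports actual row counts,
-- timings and buffer usage per plan node. Only use this once the statement
-- finishes in a tolerable time.
EXPLAIN (ANALYZE, BUFFERS)
SELECT m.userid, COUNT(l.login_id) AS ct_login
FROM mongouser m
LEFT OUTER JOIN user_login l ON m.userid = l.user_id
GROUP BY m.userid;
```

In the output, sequential scans on large tables and nodes whose actual row counts are far above the estimates are usually the first places to look.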

@GordonM Thanks for the tips! – YOBOX

Answer

Going by your own description, you already know where the problem is. Before, you had this:

LEFT OUTER JOIN user u 
ON m.userid = u.id 

and you said things were "not slow". Then you added this:

LEFT OUTER JOIN user_login l 
ON m.userid = l.user_id 

and you said things became slow. You very likely have an index on m.userid. Do you have an index on l.user_id? If not, create one:

CREATE INDEX foo ON user_login (user_id); 
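
To check which indexes already exist before creating new ones, the pg_indexes system view can be queried. This is a minimal sketch; the index names in the CREATE INDEX statements are hypothetical, and whether each index is actually needed depends on what the first query returns.

```sql
-- List the existing indexes on the tables involved in the joins.
SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE tablename IN ('mongouser', 'user_login', 'transactions');

-- Hypothetical index names; create only those that are missing.
CREATE INDEX mongouser_userid_idx ON mongouser (userid);
CREATE INDEX transactions_user_id_idx ON transactions (user_id);
```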

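You are right. userid in the mongouser table does not have an index, while user_id in the user_login table does. But could you explain why adding the user_login table can make everything slower? Before adding this table the query took 22 seconds. – YOBOX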

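22 seconds is already slow. I don't know about user, but if you don't have an index, you have to run through the whole table and join the rows in a nested loop (if ordered), or join them with a bitmap join. Both are slow. –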
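
One further possible contributor, offered as an assumption rather than something confirmed in this thread: transactions and user_login are both joined one-to-many to the same user. A user with N transactions and M logins therefore produces N × M rows before GROUP BY, which inflates the work and also multiplies the SUM and COUNT results. A minimal sketch of aggregating each table to one row per user before joining follows. The CTE names tx and logins are hypothetical, only a few of the original output columns are shown, and login_date is assumed to be a date or timestamp column.

```sql
WITH tx AS (
    -- One row per user: transaction aggregates computed on their own.
    SELECT user_id,
           SUM(CASE WHEN transaction_type = 'Credit' AND balance > 1.00
                    THEN balance END) AS load_amt,
           MAX(CASE WHEN transaction_type = 'Credit' AND balance > 1.00
                    THEN DATE(posted_at) END) AS last_load
    FROM transactions
    GROUP BY user_id
),
logins AS (
    -- One row per user: logins in the last 90 days.
    SELECT user_id,
           COUNT(login_id) AS ct_login
    FROM user_login
    WHERE login_date >= CURRENT_DATE - INTERVAL '90 days'
    GROUP BY user_id
)
SELECT m.userid,
       m.email,
       tx.load_amt,
       tx.last_load,
       COALESCE(logins.ct_login, 0) AS ct_login
FROM mongouser m
LEFT OUTER JOIN tx ON tx.user_id = m.userid
LEFT OUTER JOIN logins ON logins.user_id = m.userid
WHERE tx.last_load >= CURRENT_DATE - INTERVAL '90 days'
ORDER BY tx.load_amt DESC;
```

Because each CTE returns at most one row per user, the final joins cannot multiply rows, and the login count no longer depends on how many transactions a user has.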

If you are satisfied, please mark this answer as accepted. –