2017-02-10 56 views
0

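PostgreSQL IN clause with nested SELECT vs. JOIN performance

I currently have a query that works well but will run into scaling problems. The alternative I came up with is very slow, and I'm looking to speed up that second query.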

The old query, which won't scale well:

SELECT users.score 
FROM users 
WHERE 
    users.id IN (
        SELECT user_id 
        FROM companies_users 
        WHERE companies_users.company_id = X 
    ) 

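I then loop over the returned scores to group them. Scores range from -10 to 10. The problem comes from the IN (SELECT ...) clause and the iteration: more than a million user_ids could be returned.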

The alternative I came up with should scale better, but it is insanely slow:

SELECT 
    COUNT(*) as total_scores, 
    (SELECT COUNT(*) FROM users 
    JOIN companies_users as cu ON cu.user_id = users.id 
    WHERE users.score = 10 AND cu.company_id = X) as "10", 
    (SELECT COUNT(*) FROM users 
    JOIN companies_users as cu ON cu.user_id = users.id 
    WHERE users.score = 9 AND cu.company_id = X) as "9", 
... 
    (SELECT COUNT(*) FROM users 
    JOIN companies_users as cu ON cu.user_id = users.id 
    WHERE users.score = -9 AND cu.company_id = X) as "-9", 
    (SELECT COUNT(*) FROM users 
    JOIN companies_users as cu ON cu.user_id = users.id 
    WHERE users.score = -10 AND cu.company_id = X) as "-10" 
FROM users 
    JOIN companies_users as cu ON cu.user_id = users.id 
    WHERE cu.company_id = X 

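The first query needs an extra iteration pass in application code before the data is usable. The second one returns data that is ready to use.

Is there a way to pull the JOIN out of the nested SELECTs? That seems to cause most of the slowdown in the second query. Also, am I right that the first query won't scale well when it has to handle millions of IDs?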

Answers

1

What would be wrong with:

SELECT u.score 
FROM companies_users cu 
    JOIN users u ON cu.user_id = u.id 
WHERE cu.company_id=? 
GROUP BY u.score 
ORDER BY u.score 

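Also, do you have the appropriate indexes? You need an index on companies_users(company_id) and one on users(id). You could also try adding one on companies_users(user_id), in case the planner decides to execute the query the other way around. EXPLAIN and EXPLAIN ANALYZE are your friends.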
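
As a rough sketch of the indexing advice above: the index names below are made up for illustration, 42 stands in for a real company id, and the sketch assumes users.id is not already indexed by a primary key (if it is, no extra index on users is needed). Note that EXPLAIN ANALYZE actually runs the query, whereas plain EXPLAIN only shows the estimated plan.

```sql
-- Hypothetical index names; adjust to your own naming scheme.
-- IF NOT EXISTS on CREATE INDEX needs PostgreSQL 9.5 or later.
CREATE INDEX IF NOT EXISTS idx_companies_users_company_id
    ON companies_users (company_id);

-- Optional: helps if the planner starts from users and probes companies_users.
CREATE INDEX IF NOT EXISTS idx_companies_users_user_id
    ON companies_users (user_id);

-- Executes the query and reports the actual plan, row counts and timings.
EXPLAIN ANALYZE
SELECT u.score
FROM companies_users cu
    JOIN users u ON cu.user_id = u.id
WHERE cu.company_id = 42   -- placeholder company id
GROUP BY u.score
ORDER BY u.score;
```

Comparing the plan before and after creating the indexes should show whether the planner picks them up.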

+0

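Thanks for the reply! This is very close to perfect. I was actually looking for the counts of the different scores. I used your solution but changed the SELECT part to u.score, count(u.score) and got all the data! Thanks again. – amiksch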
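
For reference, the adjusted query described in this comment would presumably look like the first statement below. The alias total and the company id 42 are placeholders, not taken from the thread. The second statement is an optional variation, not something proposed in the thread: it keeps the single-row, one-column-per-score shape of the original slow query while joining only once, using the aggregate FILTER clause (PostgreSQL 9.4 or later). Only two of the score columns are written out; the rest would follow the same pattern.

```sql
-- One row per distinct score with the number of users holding it.
-- COUNT(u.score) ignores NULL scores; COUNT(*) would count them too.
SELECT u.score, COUNT(u.score) AS total
FROM companies_users cu
    JOIN users u ON cu.user_id = u.id
WHERE cu.company_id = 42   -- placeholder company id
GROUP BY u.score
ORDER BY u.score;

-- Variation: a single row with one column per score, with one join only.
SELECT
    COUNT(*) AS total_scores,
    COUNT(*) FILTER (WHERE u.score = 10)  AS "10",
    COUNT(*) FILTER (WHERE u.score = -10) AS "-10"
FROM companies_users cu
    JOIN users u ON cu.user_id = u.id
WHERE cu.company_id = 42;  -- placeholder company id
```

The GROUP BY form returns only scores that actually occur, so a score held by no user is simply absent from the result, while the FILTER form returns 0 for it.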