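How to optimize a PostgreSQL leaderboard window function query

In our API we have a basic rank/leaderboard feature: each client user has a list of "actions" they can perform, each action yields a score, and every action is logged in the "actions" table. Each user can then request the leaderboard for the current month (the leaderboard resets every month). Nothing fancy.
We have two tables: one with the users and one with the actions (I have removed the irrelevant columns):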
> \d client_users
Table "public.client_users"
Column | Type | Modifiers
------------------------+-----------------------------+-----------------------------------------------------------
id | integer | not null default nextval('client_users_id_seq'::regclass)
app_id | integer |
user_id | character varying | not null
created_at | timestamp without time zone |
updated_at | timestamp without time zone |
Indexes:
"client_users_pkey" PRIMARY KEY, btree (id)
"index_client_users_on_app_id" btree (app_id)
"index_client_users_on_user_id" btree (user_id)
Foreign-key constraints:
"client_users_app_id_fk" FOREIGN KEY (app_id) REFERENCES apps(id)
Referenced by:
TABLE "leaderboard_actions" CONSTRAINT "leaderboard_actions_client_user_id_fk" FOREIGN KEY (client_user_id) REFERENCES client_users(id)
> \d leaderboard_actions
Table "public.leaderboard_actions"
Column | Type | Modifiers
----------------+-----------------------------+------------------------------------------------------------------
id | integer | not null default nextval('leaderboard_actions_id_seq'::regclass)
client_user_id | integer |
score | integer | not null default 0
created_at | timestamp without time zone |
updated_at | timestamp without time zone |
Indexes:
"leaderboard_actions_pkey" PRIMARY KEY, btree (id)
"index_leaderboard_actions_on_client_user_id" btree (client_user_id)
"index_leaderboard_actions_on_created_at" btree (created_at)
Foreign-key constraints:
"leaderboard_actions_client_user_id_fk" FOREIGN KEY (client_user_id) REFERENCES client_users(id)
I am trying to optimize the following query:
SELECT
cu.user_id,
SUM(la.score) AS total_score,
rank() OVER (ORDER BY SUM(la.score) DESC) AS ranking
FROM client_users cu
JOIN leaderboard_actions la ON cu.id = la.client_user_id
WHERE cu.app_id = 8
AND la.created_at BETWEEN '2017-07-01 00:00:00.000000' AND '2017-07-31 23:59:59.999999'
GROUP BY cu.id
ORDER BY total_score DESC
LIMIT 20;
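Note: client_users.user_id is a VARCHAR "human ID"; the tables are joined through the client_users.id foreign key (the naming is not great, I know :D)
Basically, I am asking PostgreSQL for the top 20 users of the current month, ranked by the total score of their actions.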
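To make the intended result concrete, here is a minimal Python sketch of what the query is meant to compute. It is only an illustration of the semantics, not how PostgreSQL executes it; the function name `monthly_leaderboard` and the tuple layouts are assumptions made up for this example.

```python
from collections import defaultdict
from datetime import datetime


def monthly_leaderboard(users, actions, app_id, start, end, limit=20):
    """Return (user_id, total_score, ranking) rows, mirroring the SQL query.

    users:   iterable of (id, app_id, user_id) tuples      (hypothetical layout)
    actions: iterable of (client_user_id, score, created_at) tuples
    """
    # WHERE cu.app_id = ...: keep this app's users, keyed by primary key.
    wanted = {pk: human_id for pk, app, human_id in users if app == app_id}

    # JOIN + created_at BETWEEN start AND end + GROUP BY cu.id with SUM(score).
    totals = defaultdict(int)
    for client_user_id, score, created_at in actions:
        if client_user_id in wanted and start <= created_at <= end:
            totals[client_user_id] += score

    # ORDER BY total_score DESC. SQL leaves the order of ties unspecified;
    # here ties simply keep the order in which users were first seen.
    ordered = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

    # rank(): tied totals share a rank and the following rank skips ahead.
    rows, prev_total, rank = [], None, 0
    for position, (pk, total) in enumerate(ordered, start=1):
        if total != prev_total:
            rank, prev_total = position, total
        rows.append((wanted[pk], total, rank))

    return rows[:limit]  # LIMIT
```

Because of the inner join, users without any action in the month do not appear at all, and two users with the same total both get the same ranking.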
As you can see from the query plan, it is not that fast:
Limit (cost=8641.96..8642.05 rows=20 width=52) (actual time=135.544..135.560 rows=20 loops=1)
Output: cu.user_id, (sum(la.score)), (rank() OVER (?)), cu.id
-> WindowAgg (cost=8641.96..8841.42 rows=44326 width=52) (actual time=135.543..135.559 rows=20 loops=1)
Output: cu.user_id, (sum(la.score)), rank() OVER (?), cu.id
-> Sort (cost=8641.96..8664.12 rows=44326 width=44) (actual time=135.538..135.539 rows=20 loops=1)
Output: (sum(la.score)), cu.id, cu.user_id
Sort Key: (sum(la.score)) DESC
Sort Method: quicksort Memory: 1451kB
-> HashAggregate (cost=7824.77..7957.75 rows=44326 width=44) (actual time=130.938..133.124 rows=10411 loops=1)
Output: sum(la.score), cu.id, cu.user_id
Group Key: cu.id
-> Hash Join (cost=5858.66..7780.44 rows=44326 width=40) (actual time=50.849..111.346 rows=79382 loops=1)
Output: cu.id, cu.user_id, la.score
Hash Cond: (la.client_user_id = cu.id)
-> Index Scan using index_leaderboard_actions_on_created_at on public.leaderboard_actions la (cost=0.09..1736.77 rows=69494 width=8) (actual time=0.020..33.773 rows=79382 loops=1)
Output: la.id, la.client_user_id, la.rule_id, la.score, la.created_at, la.updated_at, la.success
Index Cond: ((la.created_at >= '2017-07-01 00:00:00'::timestamp without time zone) AND (la.created_at <= '2017-07-31 23:59:59.999999'::timestamp without time zone))
-> Hash (cost=5572.11..5572.11 rows=81846 width=36) (actual time=50.330..50.330 rows=81859 loops=1)
Output: cu.user_id, cu.id
Buckets: 131072 Batches: 1 Memory Usage: 6583kB
-> Seq Scan on public.client_users cu (cost=0.00..5572.11 rows=81846 width=36) (actual time=0.014..34.539 rows=81859 loops=1)
Output: cu.user_id, cu.id
Filter: (cu.app_id = 8)
Rows Removed by Filter: 46610
Planning time: 1.276 ms
Execution time: 136.176 ms
(26 rows)
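For reading the plan, a rough per-node breakdown can help. The sketch below copies the inclusive "actual time" totals from the output above and subtracts each node's children to approximate the time spent in the node itself. This is only an approximation: it relies on every node reporting loops=1, and EXPLAIN ANALYZE adds its own instrumentation overhead. The `PLAN` structure and `exclusive_times` helper are made up for this illustration.

```python
# (node name, inclusive "actual time" total in ms, children), copied from the plan.
PLAN = ("Limit", 135.560, [
    ("WindowAgg", 135.559, [
        ("Sort", 135.539, [
            ("HashAggregate", 133.124, [
                ("Hash Join", 111.346, [
                    ("Index Scan on leaderboard_actions", 33.773, []),
                    ("Hash", 50.330, [
                        ("Seq Scan on client_users", 34.539, []),
                    ]),
                ]),
            ]),
        ]),
    ]),
])


def exclusive_times(node, out=None):
    """Map each node name to its inclusive time minus its children's inclusive time."""
    out = {} if out is None else out
    name, total, children = node
    out[name] = round(total - sum(child[1] for child in children), 3)
    for child in children:
        exclusive_times(child, out)
    return out


if __name__ == "__main__":
    # Print the nodes from most to least expensive.
    for name, ms in sorted(exclusive_times(PLAN).items(), key=lambda kv: -kv[1]):
        print(f"{ms:8.3f} ms  {name}")
```

Read this way, the two scans account for roughly 34 ms each, the hash join itself for about 27 ms, the aggregation for about 22 ms and building the hash table for about 16 ms, while the sort, window function and limit together take under 3 ms.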
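To give you an idea of the sizes:
- client_users has about 128471 rows, of which only 81860 are targeted by the query (app_id = 8)
- leaderboard_actions has 1609992 rows, of which 79435 fall in the current month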
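As a quick back-of-the-envelope check on these sizes, the snippet below computes what fraction of each table the two filters keep. The constant names and the `selectivity` helper are only illustrative.

```python
# Row counts as stated above.
CLIENT_USERS_TOTAL = 128_471
CLIENT_USERS_APP_8 = 81_860
ACTIONS_TOTAL = 1_609_992
ACTIONS_THIS_MONTH = 79_435


def selectivity(matching, total):
    """Fraction of a table's rows that a filter keeps."""
    return matching / total


users_fraction = selectivity(CLIENT_USERS_APP_8, CLIENT_USERS_TOTAL)
actions_fraction = selectivity(ACTIONS_THIS_MONTH, ACTIONS_TOTAL)

print(f"app_id = 8 keeps {users_fraction:.1%} of client_users")
print(f"the month filter keeps {actions_fraction:.1%} of leaderboard_actions")
```

This prints 63.7% for client_users and 4.9% for leaderboard_actions. Those fractions are at least consistent with the plan reading client_users with a sequential scan while using the created_at index for leaderboard_actions, although that is an interpretation and not something the plan states.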
Any ideas?
Thanks!
I disagree: for the amount of information you are asking for, the plan *is* fast. – joanolo