CREATE TABLE t (
  uuid4 UUID PRIMARY KEY
, arr TEXT[]
, geom GEOMETRY
, ts TIMESTAMP WITHOUT TIME ZONE
);
CREATE INDEX ON t USING GIST (geom);
The query looks like this:
explain analyze
SELECT kmeans
, count(*)::int
, ST_X(ST_Centroid(ST_Collect(geom))) AS lon
, ST_Y(ST_Centroid(ST_Collect(geom))) AS lat
, STRING_TO_ARRAY(STRING_AGG(ARRAY_TO_STRING(arr, ','), ','), ',') AS arr
FROM (
SELECT kmeans(ARRAY[ST_X(geom), ST_Y(geom)], 25) OVER(), geom, arr
FROM t
WHERE ts > NOW() - '12 hours'::interval
AND geom IS NOT NULL
AND uuid4 != '9ab0f8cd-9707-41da-8e30-6d29a0f22242'::uuid
AND arr @> (SELECT arr FROM t WHERE uuid4 = '9ab0f8cd-9707-41da-8e30-6d29a0f22242'::uuid LIMIT 1)
AND ST_Distance_Sphere(ST_MakePoint(-77, 38), geom) < 10000
) AS ksub
GROUP BY kmeans
ORDER BY kmeans;
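Essentially, I am optimizing a query that finds all rows within a certain distance and within a certain time range that have geom populated and whose arr contains every item in a specified arr. The rows found are then clustered with the kmeans function from kmeans-postgresql. I currently see: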
GroupAggregate (cost=347.69..349.59 rows=38 width=98) (actual time=50.034..50.384 rows=25 loops=1)
-> Sort (cost=347.69..347.78 rows=38 width=98) (actual time=49.994..49.999 rows=99 loops=1)
Sort Key: (kmeans(ARRAY[st_x(t.geom), st_y(t.geom)], 25) OVER (?))
Sort Method: quicksort Memory: 42kB
-> WindowAgg (cost=25.18..346.31 rows=38 width=94) (actual time=49.955..49.968 rows=99 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.29..8.30 rows=1 width=62) (actual time=0.018..0.018 rows=1 loops=1)
-> Index Scan using t_uuid4_ts_idx on t t_1 (cost=0.29..8.30 rows=1 width=62) (actual time=0.017..0.017 rows=1 loops=1)
Index Cond: (uuid4 = '9ab0f8cd-9707-41da-8e30-6d29a0f22242'::uuid)
-> Bitmap Heap Scan on t (cost=16.88..337.34 rows=38 width=94) (actual time=13.363..49.747 rows=99 loops=1)
Recheck Cond: (arr @> $0)
Filter: ((geom IS NOT NULL) AND (uuid4 <> '9ab0f8cd-9707-41da-8e30-6d29a0f22242'::uuid) AND (ts > (now() - '12:00:00'::interval)) AND (_st_distance('0101000020E610000000000000004053C00000000000004340'::geography, geography(geom), 0::double precision, false) < 10000::double precision))
Rows Removed by Filter: 22989
-> Bitmap Index Scan on t_arr_idx (cost=0.00..16.87 rows=115 width=0) (actual time=13.072..13.072 rows=23089 loops=1)
Index Cond: (arr @> $0)
Total runtime: 50.464 ms
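It would seem that the bitmap heap scan plus bitmap index scan is the optimal indexing solution, but I keep wondering whether there is a way to avoid the extra filter and recheck. Can I improve performance by building an alternative index? I have already tried: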
Indexes:
"t_pkey" PRIMARY KEY, btree (uuid4)
"t_geom_idx" gist (geom)
"t_geom_ts_idx" gist (geom, ts)
"t_geom_ts_uuid4_idx" gist (geom, ts, (uuid4::text))
"t_iam_idx" gin (arr)
"t_ts_geom_idx" gist (ts, geom)
"t_ts_geom_uuid4_idx" gist (ts, geom, (uuid4::text))
"t_ts_uuid4_geom_idx" gist (ts, (uuid4::text), geom)
"t_uuid4_ts_idx" btree (uuid4, ts)
Note that kmeans is an extension from https://github.com/umitanuki/kmeans-postgresql.
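Nice query. Have you tried using ST_DWithin instead of ST_Distance_Sphere? It might make better use of the spatial index rather than actually calculating all of those distances.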
That's the ticket. Please post the results of your suggestion as an answer. Thanks! – Justin
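For reference, the ST_DWithin suggestion from the comments could look roughly like the sketch below. It is not a tested answer. It assumes geom holds lon/lat points in SRID 4326, which the geography cast in the plan suggests, and the index name t_geog_idx is made up for illustration. The kmeans wrapper is left out so that only the inner filter is compared.

```sql
-- Hypothetical expression index (the name is an assumption): the distance
-- test in the plan runs on geography(geom), so the GiST index is built on
-- the same expression rather than on the raw geometry.
CREATE INDEX t_geog_idx ON t USING GIST ((geom::geography));

EXPLAIN ANALYZE
SELECT uuid4, geom, arr
FROM t
WHERE ts > NOW() - '12 hours'::interval
  AND geom IS NOT NULL
  AND uuid4 != '9ab0f8cd-9707-41da-8e30-6d29a0f22242'::uuid
  AND arr @> (SELECT arr FROM t WHERE uuid4 = '9ab0f8cd-9707-41da-8e30-6d29a0f22242'::uuid LIMIT 1)
  -- ST_DWithin on geography takes the distance in meters; the final
  -- argument (use_spheroid = false) requests a spherical calculation,
  -- mirroring ST_Distance_Sphere in the original query.
  AND ST_DWithin(
        geom::geography,
        ST_SetSRID(ST_MakePoint(-77, 38), 4326)::geography,
        10000,
        false);
```

Whether the planner prefers this index over the GIN index on arr depends on how selective each condition is for the data, so the new EXPLAIN ANALYZE output should be compared against the 50 ms plan above before keeping the index.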