I would consider using the extension pg_trgm instead of levenshtein(). If you back it up with a GiST index, it can use the new KNN feature in PostgreSQL 9.1 and is faster by orders of magnitude.
Install the extension once per database:
CREATE EXTENSION pg_trgm;
And utilize the <-> or % operator, or the similarity() function. Several good answers have already been posted on SO, search for pg_trgm [PostgreSQL] ...
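For the index side, here is a minimal sketch. It assumes a hypothetical table words(word text); the table, column and index names are made up for illustration and are not from the question. gist_trgm_ops is the GiST operator class that ships with pg_trgm.

```sql
-- Hypothetical table, for illustration only
CREATE TABLE words (word text);

-- GiST trigram index, so that <-> and % can use the index
CREATE INDEX words_word_trgm_idx ON words USING gist (word gist_trgm_ops);

-- Nearest-neighbour (KNN) search: the 10 words closest to 'brat'
SELECT word, word <-> 'brat' AS trg_dist
FROM   words
ORDER  BY word <-> 'brat'
LIMIT  10;

-- Alternative: keep only words above the similarity threshold
SELECT word, similarity(word, 'brat') AS sim
FROM   words
WHERE  word % 'brat'
ORDER  BY sim DESC;
```

The ORDER BY ... <-> ... LIMIT form is the one that benefits from the KNN feature; the % form filters by a similarity threshold instead.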
A wild shot at what you may want:
WITH x AS (
    SELECT unnest(string_to_array(trim(strip(
        'fat:2,4 cat:3 rat:5A'::tsvector)::text, ''''), ''' ''')) AS val
    )                                      -- provide tsvector, extract strings
   , y AS (SELECT 'brat'::text AS term)    -- provide term to match
SELECT val, term
      ,(val <-> term) AS trg_dist          -- distance operator (pg_trgm)
      ,levenshtein(val, term) AS lev_dist  -- requires the fuzzystrmatch extension
FROM   x, y;
Returns:
 val | term | trg_dist | lev_dist
-----+------+----------+----------
 cat | brat |    0.875 |        2
 fat | brat |    0.875 |        2
 rat | brat | 0.714286 |        1
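The trg_dist values follow from the trigram sets. pg_trgm pads each word with two leading spaces and one trailing space, so 'rat' yields 4 trigrams and 'brat' yields 5. They share 2 ('rat' and 'at ') out of 7 distinct trigrams, which gives 1 - 2/7 ≈ 0.714286. A small query to inspect this yourself, assuming pg_trgm is installed:

```sql
SELECT show_trgm('rat')                AS trgm_rat   -- trigrams of the padded word
     , show_trgm('brat')               AS trgm_brat
     , similarity('rat', 'brat')       AS sim        -- shared trigrams / distinct trigrams
     , 'rat'::text <-> 'brat'::text    AS trg_dist;  -- 1 - similarity
```

The same arithmetic for 'cat' and 'fat' gives 1 shared trigram out of 8, hence 0.875.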
It is not clear what you are after. Add a clear example of what you want, and add the definition of the relevant table(s).