Text n-grams in PostgreSQL

I am looking to create n-grams from a text column in PostgreSQL. I currently split the data (sentences) in the text column into an array, splitting on whitespace.

select regexp_split_to_array(sentenceData, E'\\s+') from tableName

Once I have this array, how do I go about:

Creating a loop to find the n-grams, and writing each one as a row in another table

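Using unnest I can get all the elements of all the arrays on separate rows, and maybe I can then work out a way to build n-grams from that single column, but I would lose the sentence boundaries, which I want to preserve.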

Sample SQL code to emulate the above scenario

create table tableName(sentenceData text); 

INSERT INTO tableName(sentenceData) VALUES('This is a long sentence'); 

INSERT INTO tableName(sentenceData) VALUES('I am currently doing grammar, hitting this monster book btw!'); 

INSERT INTO tableName(sentenceData) VALUES('Just tonnes of grammar, problem is I bought it in TAIWAN, and so there aint any englihs, just chinese and japanese'); 

select regexp_split_to_array(sentenceData,E'\\s+') from tableName; 

select unnest(regexp_split_to_array(sentenceData,E'\\s+')) from tableName;
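For illustration, here is a minimal sketch of one way to build word-level n-grams without losing sentence boundaries: keep each sentence's word array on its own row and slice it at every start position. It assumes PostgreSQL 9.3 or later (for LATERAL) and the sample tableName above. The target table sentence_ngrams, the params helper, and the choice n = 2 are hypothetical names and values introduced only for this example.

```sql
-- Hypothetical target table: one row per (sentence, start position, n-gram).
CREATE TABLE sentence_ngrams AS
WITH params AS (
    SELECT 2 AS n                       -- assumed n-gram size; change as needed
),
words AS (
    -- One row per sentence, with the words kept together in an array
    SELECT sentenceData,
           regexp_split_to_array(sentenceData, E'\\s+') AS w
    FROM tableName
)
SELECT t.sentenceData,
       g.i AS start_pos,
       -- Slice n consecutive words starting at position i and join them
       array_to_string(t.w[g.i : g.i + p.n - 1], ' ') AS ngram
FROM words t
CROSS JOIN params p
-- Start positions 1 .. (word count - n + 1); yields no rows for short sentences
CROSS JOIN LATERAL generate_series(1, array_upper(t.w, 1) - p.n + 1) AS g(i)
ORDER BY t.sentenceData, g.i;
```

For the first sample sentence this yields the four bigrams "This is", "is a", "a long" and "long sentence", each still tied to its source sentence. Punctuation stays attached to the words because the split is on whitespace only.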

Source

2010-06-15 harshsinghal

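Check out pg_trgm: "The pg_trgm module provides functions and operators for determining the similarity of text based on trigram matching, as well as index operator classes that support fast searching for similar strings."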
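As a rough sketch of what the module offers, the queries below show its trigram extraction and similarity scoring against the sample tableName from the question. Note that pg_trgm works on character trigrams within words, not on word-level n-grams, so it only partly covers what the question asks for. The example assumes PostgreSQL 9.1 or later, where contrib modules are installed with CREATE EXTENSION; the search string 'grammar book' is made up for illustration.

```sql
-- Assumes PostgreSQL 9.1+; older versions install the contrib SQL script instead
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- show_trgm returns the array of character trigrams extracted from a string
SELECT show_trgm('grammar');

-- similarity returns a score between 0 and 1 based on shared trigrams
SELECT sentenceData,
       similarity(sentenceData, 'grammar book') AS score
FROM tableName
ORDER BY score DESC;
```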

Source

2010-06-15 15:42:02
