2010-09-30

Answers

1

I still don't know what an N-gram is, but based on Ed's answer, is this what you need?

declare @string varchar(max) = 'hello' 
declare @n int = 3 

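-- pad with '-' up to a multiple of @n; the outer % @n adds nothing when the length already divides evenly
set @string = @string + REPLICATE('-', (@n - (len(@string) % @n)) % @n) 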

;with n as 
(
SELECT 1 AS i 
UNION ALL 
SELECT i+1 
FROM n 
WHERE i <= (LEN(@string)-@n) 
) 
select SUBSTRING(@string, i, @n), COUNT(*) 
from n 
group by SUBSTRING(@string, i, @n) 
option (maxrecursion 0) 

lo- must be part of the output – jozi 2010-09-30 16:05:47

My question is "why"? That is, can you explain it better for someone who doesn't know what a trigram is? – 2010-09-30 16:07:34

http://en.wikipedia.org/wiki/Trigram – jozi 2010-09-30 16:12:14

2

Based on Martin Smith's answer - by Ed and Martin 3

declare @string varchar(max) = 'hello' 

SET @string = (SELECT CASE LEN(@string) % 3 
          WHEN 1 THEN @string + '--' 
          WHEN 2 THEN @string + '-' 
          ELSE @string 
         END) 
;with n as 
(
SELECT 1 AS i 
UNION ALL 
SELECT i+1 
FROM n 
WHERE i < (LEN(@string)-2) 
) 
select SUBSTRING(@string, i, 3) AS Trigram, COUNT(*) AS Count 
from n 
group by SUBSTRING(@string, i, 3) 
option (maxrecursion 0) 
3

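Borrowing from the answers above, but leaving out the added logic that pads the string with - to an evenly divisible number of characters, I think this is a correct implementation: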

declare @string varchar(max) = 'here kitty kitty' 

SET @string = replace(@string, ' ', '-') --Wikipedia says this should be underscore, not dash 
;with n as 
( 
    SELECT 1 AS i 
    UNION ALL 
    SELECT i + 1 
    FROM n 
    WHERE i < (LEN(@string)-2) 
) 
select SUBSTRING(@string, i, 3) AS Trigram, COUNT(*) AS Count 
from n 
group by SUBSTRING(@string, i, 3) 
option (maxrecursion 0) 

I don't know whether this is correct, but +1. Have you read the Wikipedia article? Edit: I see you did! – 2010-09-30 16:30:06

I read the brief trigram page (http://en.wikipedia.org/wiki/Trigram). The algorithm gives the same output as that page. – RedFilter 2010-09-30 16:33:24
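
Answer 1 takes the gram length from a variable (@n), while answer 3 fixes it at 3. As a rough sketch only, not something any of the posters wrote, the two could be combined like this. The column names Ngram and Occurrences and the extra WHERE guard are my own assumptions for illustration:

```sql
-- Sketch: answer 3's query with the gram length held in @n (borrowed from answer 1).
-- Ngram / Occurrences are illustrative column names, not from the original answers.
declare @string varchar(max) = 'here kitty kitty'
declare @n int = 3

SET @string = replace(@string, ' ', '-')

;with n as
(
    SELECT 1 AS i
    UNION ALL
    SELECT i + 1
    FROM n
    WHERE i < (LEN(@string) - @n + 1)   -- last start position is LEN - @n + 1
)
select SUBSTRING(@string, i, @n) AS Ngram, COUNT(*) AS Occurrences
from n
where i <= LEN(@string) - @n + 1        -- assumed guard: no rows if the string is shorter than @n
group by SUBSTRING(@string, i, @n)
order by Ngram
option (maxrecursion 0)
```

With @n = 3 the recursion condition reduces to i < LEN(@string)-2, the same bound answer 3 uses, so for 'here kitty kitty' the grams -ki, kit, itt and tty should each be counted twice and every other gram once.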
