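Using SQL to get the word count of a column
I have a table with a column named Description. The column is populated with text data. I want to create a query that returns the number of words in each description.
My idea is to create a function that takes a value and returns the number of words found in the input text.
SELECT dbo.GetWordCount(Description) FROM TABLE
For example, if the description is "Hello World! Have a nice day.", the query should return 6.
How can I get the word count of the Description column?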
See this suggested solution: http://www.sql-server-helper.com/functions/count-words.aspx
CREATE FUNCTION [dbo].[WordCount] (@InputString VARCHAR(4000))
RETURNS INT
AS
BEGIN
    DECLARE @Index INT
    DECLARE @Char CHAR(1)
    DECLARE @PrevChar CHAR(1)
    DECLARE @WordCount INT
    SET @Index = 1
    SET @WordCount = 0
    WHILE @Index <= LEN(@InputString)
    BEGIN
        SET @Char = SUBSTRING(@InputString, @Index, 1)
        SET @PrevChar = CASE WHEN @Index = 1 THEN ' '
                             ELSE SUBSTRING(@InputString, @Index - 1, 1)
                        END
        IF @PrevChar = ' ' AND @Char != ' '
            SET @WordCount = @WordCount + 1
        SET @Index = @Index + 1
    END
    RETURN @WordCount
END
GO
Usage example:
DECLARE @String VARCHAR(4000)
SET @String = 'Health Insurance is an insurance against expenses incurred through illness of the insured.'
SELECT [dbo].[WordCount] (@String)
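Since the question asks about a table column rather than a single variable, here is a minimal sketch of calling the function per row. The table variable @Products and its sample rows are hypothetical stand-ins for the real table; the function is the dbo.WordCount defined above.

```sql
-- Hypothetical sample data standing in for the asker's table.
DECLARE @Products TABLE (Description VARCHAR(4000));

INSERT INTO @Products (Description)
VALUES ('Hello World! Have a nice day.'),
       ('  leading and   repeated spaces  '),
       ('');

-- The scalar function is evaluated once per row.
SELECT Description,
       dbo.WordCount(Description) AS WordCount
FROM @Products;
-- Word counts for the three sample rows: 6, 4 and 0.
```

A scalar UDF like this runs row by row, so on large tables it may be noticeably slower than the inline approaches further down.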
This one is a bit clumsy, but it handles whitespace well, and it is fast and inline, with no UDF.
DECLARE @Term VARCHAR(100) = ' this is pretty fast '
SELECT @Term,
       LEN(REPLACE(REPLACE(REPLACE(' ' + @Term, ' ', ' ' + CHAR(1)), CHAR(1) + ' ', ''), CHAR(1), ''))
     - LEN(REPLACE(REPLACE(REPLACE(REPLACE(' ' + @Term, ' ', ' ' + CHAR(1)), CHAR(1) + ' ', ''), CHAR(1), ''), ' ', '')) [Word Count]
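To make the nested REPLACE calls easier to follow, the sketch below splits the same expression into two steps and applies it to a column. The table variable @Samples, its rows, and the aliases c and Collapsed are hypothetical names added for illustration. The first step collapses every run of spaces into a single space (after prepending one space), and the second step counts the spaces that remain, which then equals the number of words.

```sql
-- Hypothetical sample rows, including repeated, leading and trailing spaces.
DECLARE @Samples TABLE (Description VARCHAR(100));

INSERT INTO @Samples (Description)
VALUES (' this   is pretty    fast '),
       ('one'),
       ('');

SELECT s.Description,
       c.Collapsed,
       -- Each word is now preceded by exactly one space, and LEN ignores
       -- trailing spaces, so the number of spaces is the number of words.
       LEN(c.Collapsed) - LEN(REPLACE(c.Collapsed, ' ', '')) AS WordCount
FROM @Samples AS s
CROSS APPLY (VALUES (
    -- Same three REPLACE calls as above: mark every space with CHAR(1),
    -- drop each marker that is followed by a space, then drop the markers.
    REPLACE(REPLACE(REPLACE(' ' + s.Description, ' ', ' ' + CHAR(1)),
                    CHAR(1) + ' ', ''),
            CHAR(1), '')
)) AS c (Collapsed);
-- Word counts for the three sample rows: 4, 1 and 0.
```

The trick relies on CHAR(1) not appearing in the data; if it could, a different marker character would be needed.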
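In addition to Mortalus's answer, I would use an inline table-valued function rather than a scalar one (note: this function works on SQL Server 2012 and later). For earlier versions of SQL Server, see further below: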
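/* SQL Server 2012 and up */
CREATE FUNCTION dbo.udf_WordCount
(
    @str VARCHAR(8000)
)
RETURNS TABLE AS RETURN
WITH Tally (n) AS
(
    -- Numbers 1..LEN(@str); the cross joins supply up to 8 * 10 * 10 * 10 = 8000 rows.
    SELECT TOP (LEN(@str)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0)) a(n)
    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) b(n)
    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) c(n)
    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d(n)
)
, BreakChar AS
(
    -- One row per character of the input.
    SELECT SUBSTRING(@str, n, 1) [Char], n
    FROM Tally
)
, Analize AS
(
    -- Pair each character with the one before it. The first character is
    -- treated as if it followed a space, so no "+ 1" correction is needed and
    -- empty strings or leading spaces are not overcounted.
    SELECT *, LAG([Char], 1, ' ') OVER (ORDER BY n) PrevChar
    FROM BreakChar
)
-- A word starts wherever a non-space character follows a space.
SELECT WordCount = COUNT(1)
FROM Analize
WHERE [Char] != PrevChar
  AND PrevChar = ' '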
How to use:
DECLARE @str varchar(1000) = 'It''s now or never I ain''t gonna live forever'
SELECT * FROM dbo.udf_WordCount(@str) --> 9
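For the original question, where the text lives in a column, an inline table-valued function is typically called with CROSS APPLY. The following is a minimal sketch under that assumption; the table variable @Docs, its columns, and its rows are hypothetical.

```sql
-- Hypothetical sample data standing in for the asker's table.
DECLARE @Docs TABLE (Id INT, Description VARCHAR(8000));

INSERT INTO @Docs (Id, Description)
VALUES (1, 'Hello World! Have a nice day.'),
       (2, 'It''s now or never');

-- CROSS APPLY evaluates the inline function once per row of @Docs.
SELECT d.Id,
       d.Description,
       wc.WordCount
FROM @Docs AS d
CROSS APPLY dbo.udf_WordCount(d.Description) AS wc;
-- Word counts for the two sample rows: 6 and 4.
```

Because the function is inline, the optimizer can fold it into the calling query, which is the main reason to prefer it over a scalar UDF here.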
SQL Server 2008 and lower:
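/* SQL Server 2008 and down */
CREATE FUNCTION dbo.udf_WordCount_2008
(
    --declare
    @str VARCHAR(8000)
    --= 'It''s now or never I ain''t gonna live forever'
)
RETURNS TABLE AS RETURN
WITH Tally (n) AS
(
    -- Numbers 1..LEN(@str); the cross joins supply up to 8 * 10 * 10 * 10 = 8000 rows.
    SELECT TOP (LEN(@str)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0)) a(n)
    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) b(n)
    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) c(n)
    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d(n)
)
, BreakChar AS
(
    -- One row per character of the input.
    SELECT SUBSTRING(@str, n, 1) [Char], n
    FROM Tally
)
, Analize AS
(
    -- LAG is not available before SQL Server 2012, so join each character to
    -- the one before it. The LEFT JOIN keeps the first character, whose
    -- missing predecessor is treated as a space.
    SELECT a.*, ISNULL(b.[Char], ' ') PrevChar
    FROM BreakChar a
    LEFT JOIN BreakChar b
        ON a.n = b.n + 1
)
-- A word starts wherever a non-space character follows a space.
SELECT WordCount = COUNT(1)
FROM Analize
WHERE [Char] != PrevChar
  AND PrevChar = ' '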
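General syntax (this counts the spaces and adds 1, so it assumes words are separated by single spaces with no leading or trailing spaces; in SQL Server, use LEN instead of LENGTH):
SELECT (LENGTH(column_name) - LENGTH(REPLACE(column_name, ' ', '')) + 1) AS word_count, column_name1, column_name2 FROM table_name;
For example, to count the words in the 'address' column of a table named "employeeDetails":
SELECT (LENGTH(address) - LENGTH(REPLACE(address, ' ', '')) + 1) AS word_count, address, employee_name FROM employeeDetails;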
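A rough SQL Server rendering of the same space-counting idea is sketched below, mainly to show where it breaks down. The table variable @employeeDetails and its rows are hypothetical, and the LTRIM/RTRIM call is an addition that only removes leading and trailing spaces.

```sql
-- Hypothetical rows mirroring the employeeDetails example.
DECLARE @employeeDetails TABLE (employee_name VARCHAR(50), address VARCHAR(200));

INSERT INTO @employeeDetails (employee_name, address)
VALUES ('Ann', '12 Main Street'),
       ('Bob', '  7   Oak   Avenue  ');

SELECT employee_name,
       address,
       -- Spaces inside the trimmed text, plus one.
       LEN(LTRIM(RTRIM(address)))
         - LEN(REPLACE(address, ' ', '')) + 1 AS word_count
FROM @employeeDetails;
-- Ann's address gives 3. Bob's address gives 7 rather than 3, because
-- every repeated space between words is counted.
```

If the data may contain repeated spaces, one of the other approaches in this thread is a safer choice.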
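This answer builds on Mortalus's answer, using the same code, which I originally found here.
This solution is a more efficient and more concise version of that code. I have also added some explanation, in the hope of giving future readers a clearer answer.
The following user defined function takes a string of text and loops through each character of the input. If the previous character is a space, the word count is increased by 1.
Because the count is based on the spaces between words, it would always come out one less than the actual number of words. To fix this, @PrevChar is initialized to ' '. Then, when the loop runs for the first time and the code reaches IF @PrevChar = ' ', the check evaluates to true and the word count is increased by 1. This works even when the text has length 0, because in that case the @Index <= LEN(@InputString) check fails and the word count is never increased. (This replaces the CASE expression used in the linked answer.)
AND @CurrentChar != ' ' addresses the problem of double spacing being counted as multiple words. If the previous character is a space but the current character is also a space, the loop moves on to the next index without increasing the word count. On the following iteration only @PrevChar is a space, so the word count increases just once for a double space.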
CREATE FUNCTION [dbo].[WordCount] (@InputString VARCHAR(MAX))
RETURNS INT
AS
BEGIN
    DECLARE @Index INT = 1
    DECLARE @CurrentChar CHAR(1)
    --Initialize the previous character as a space.
    DECLARE @PrevChar CHAR(1) = ' '
    DECLARE @WordCount INT = 0
    WHILE @Index <= LEN(@InputString)
    BEGIN
        --Set the current character to equal the character in the index
        --position of the inputted text.
        SET @CurrentChar = SUBSTRING(@InputString, @Index, 1)
        --If the previous character was a space and the current character
        --is not a space, increase the wordcount by 1.
        IF @PrevChar = ' ' AND @CurrentChar != ' '
            SET @WordCount = @WordCount + 1
        --Increase the index counter by 1.
        SET @Index = @Index + 1
        --Now that we are done with the current character, set the previous
        --character to equal the current character.
        SET @PrevChar = @CurrentChar
    END
    RETURN @WordCount
END
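A few quick checks of the edge cases discussed above can be sketched like this. The input strings and column aliases are made up for illustration, and the last one highlights a limitation: only the space character is treated as a separator.

```sql
SELECT dbo.WordCount('Hello World! Have a nice day.') AS PlainSentence,  -- 6
       dbo.WordCount('  two   words  ')                AS RepeatedSpaces, -- 2
       dbo.WordCount('')                               AS EmptyString,    -- 0
       -- A tab is not a space, so these two words are counted as one.
       dbo.WordCount('one' + CHAR(9) + 'two')          AS TabSeparated;   -- 1
```

If tabs or line breaks should also separate words, one option is to REPLACE them with spaces before calling the function.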