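SQL Server: check for upper or lower case after a specific character
I have data like 'John is my name; Ram is my name; Adam is my name'.
My rule is that the first letter after every ; should be a capital letter.
How can I select all values that follow the rule?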
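The other answers show how to transform a row into something that matches your pattern.
If you just want to select the rows that match the pattern you describe, you can use patindex() or like with a case-sensitive collation (or use collate to apply one).
This assumes that, in addition to your rule that the letter after every semicolon must be uppercase, the first letter of the value should also be uppercase. If that is not the case, just remove the first clause in the where.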
select *
from t
where patindex('[ABCDEFGHIJKLMNOPQRSTUVWXYZ]%', val collate latin1_general_cs_as) = 1
and patindex('%; [^ABCDEFGHIJKLMNOPQRSTUVWXYZ]%', val collate latin1_general_cs_as) = 0
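The same filter can also be written with like instead of patindex(). This is a minimal sketch, assuming the same table t and column val as in the query above. The letters are listed explicitly rather than as [A-Z] because, under a case-sensitive dictionary-order collation, a range can also match lowercase letters.

```sql
-- Sketch: LIKE-based equivalent of the patindex() query.
-- Assumes table t with a varchar column val.
select *
from t
where val collate latin1_general_cs_as like '[ABCDEFGHIJKLMNOPQRSTUVWXYZ]%'            -- value starts with an uppercase letter
  and val collate latin1_general_cs_as not like '%; [^ABCDEFGHIJKLMNOPQRSTUVWXYZ]%'    -- no "; " is followed by anything else
```

Like the patindex() version, this only inspects a semicolon followed by exactly one space, so data with other spacing would need a different pattern.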
Test setup:
create table t (id int not null identity(1,1),val varchar(256))
insert into t values
('John is my name; Ram is my name; Adam is my name')
,('john is my name; ram is my name; adam is my name')
rextester demo: http://rextester.com/DBGIS10645
Against the test data, this returns:
+----+--------------------------------------------------+
| id | val                                              |
+----+--------------------------------------------------+
|  1 | John is my name; Ram is my name; Adam is my name |
+----+--------------------------------------------------+
You could do it with an XML trick like this:
DECLARE @YourString VARCHAR(100)='John is my name; Ram is my name; Adam is my name';
WITH Splitted AS
(
SELECT CAST('<x>' + REPLACE((SELECT REPLACE(@YourString,'; ','$$SplitHere$$') AS [*] FOR XML PATH('')),'$$SplitHere$$','</x><x>')+ '</x>' AS XML) AS Casted
)
,DerivedTable AS
(
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS PartNr
,x.value(N'text()[1]',N'nvarchar(max)') AS Part
FROM Splitted
CROSS APPLY Casted.nodes(N'/x') AS X(x)
)
SELECT PartNr
,Part
,CASE WHEN ASCII(LEFT(Part,1)) BETWEEN ASCII('A') AND ASCII('Z') THEN 1 ELSE 0 END AS FirstIsCapital
FROM DerivedTable;
PartNr Part             FirstIsCapital
------ ---------------- --------------
1      John is my name  1
2      Ram is my name   1
3      Adam is my name  1
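I don't know what your final goal is... Find the parts where the first letter, after this split, is not a capital? Make sure your rule is fulfilled?
But: the best approach would be to correct your design and store these parts in a 1:n related side table.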
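A minimal sketch of what such a 1:n side table could look like. All table, column, and constraint names here are hypothetical, since the original schema is not shown. With one part per row, the rule can be enforced by a check constraint instead of being verified by string splitting.

```sql
-- Hypothetical parent table: one row per original value.
create table dbo.Parent (
    ParentId int not null identity(1,1) primary key
);

-- Hypothetical 1:n side table: one row per semicolon-separated part.
create table dbo.ParentPart (
    ParentId int          not null references dbo.Parent (ParentId),
    PartNr   int          not null,   -- position of the part within the original value
    Part     varchar(256) not null,
    constraint PK_ParentPart primary key (ParentId, PartNr),
    -- Case-sensitive check: the part must start with an uppercase letter.
    constraint CK_ParentPart_FirstUpper
        check (Part collate Latin1_General_CS_AS like '[ABCDEFGHIJKLMNOPQRSTUVWXYZ]%')
);
```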
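Test with the following source string: 'John is my name;Ram is my name;Adam is my name' –
@BogdanSahlean, then I would use 'L/RTRIM()'... The mistake is the storage format... Any code that solves this will be a hack... – Shnugo
'TRIM' was introduced in SQL Server 2017. And what if the source string is '!John is my name;!Ram is my name;!Adam is my name;'Hola!'? –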
A bit of an ugly solution, but you can give it a try...
Declare @str nvarchar(max) = 'John is my name; Ram is my name; Adam is my name'
Declare @xml as xml
Set @xml = cast(('<X>'+replace(@str,';' ,'</X><X>')+'</X>') as xml)
Select * from (
Select RowN = Row_Number() over (order by (SELECT NULL)), LTrim(RTrim(N.value('.', 'nvarchar(MAX)'))) as value FROM @xml.nodes('X') as T(N) -- this is to split if you are using sql server 2016 you can use string_Split
) a
Where unicode(substring(a.[value],1,1)) = unicode(upper(substring(a.[value],1,1)))
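The idea is to split the string and compare Unicode values to see whether the first character is uppercase or not.
Note: the standard practice is to do this in the application, using C#/VB [.NET].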
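The code comment above mentions string_split as an alternative on SQL Server 2016. A minimal sketch of that variant follows, assuming SQL Server 2016 or later with database compatibility level 130 or higher. It swaps the Unicode comparison for a case-sensitive like, so that only parts starting with an uppercase letter are kept. The sample string is made up for illustration.

```sql
-- Sketch: split on ';' with STRING_SPLIT (SQL Server 2016+, compatibility level 130+).
declare @str nvarchar(max) = N'John is my name; Ram is my name; adam is my name';

select ltrim(rtrim(s.value)) as part
from string_split(@str, N';') as s
-- Case-sensitive check: keep only parts whose first character is an uppercase letter.
where ltrim(rtrim(s.value)) collate Latin1_General_CS_AS like '[ABCDEFGHIJKLMNOPQRSTUVWXYZ]%';
```

string_split does not guarantee the order of the returned rows, so this suits filtering rather than anything that depends on the position of a part.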
[1] Solution:
DECLARE @Source NVARCHAR(100) = N'john is my name; Ram is my name; adam is my name'
SELECT z.Sentence
FROM (VALUES (CONVERT(XML, N'<root><i>' + REPLACE(@Source, N';', N'</i><i>;') + N'</i></root>'))) AS x(XmlCol)
CROSS APPLY x.XmlCol.nodes(N'/root/i') AS y(XmlCol)
CROSS APPLY (VALUES(y.XmlCol.value('(text())[1]', 'NVARCHAR(100)'))) AS z(Sentence)
WHERE SUBSTRING(z.Sentence, NULLIF(PATINDEX('%[a-z]%', z.Sentence), 0), 1) LIKE '%[a-z]%' COLLATE Latin1_General_BIN
ORDER BY ROW_NUMBER() OVER(ORDER BY y.XmlCol)
In this case, the result will be
john is my name
; adam is my name
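[2] If you are trying to capitalize the first letter of every sentence, then I would use the following solution (see the comments at the end of each line):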
DECLARE @Source NVARCHAR(100) = N'john is my name; ram is my name; adam is my name'
SELECT (
SELECT u.NewSentence AS '*'
FROM (VALUES (CONVERT(XML, N'<root><i>' + REPLACE(@Source, N';', N';</i><i>') + N'</i></root>'))) AS x(XmlCol) -- Converts the source string into XML. Every ; acts as a sentence delimiter. The end result looks like <root><i>john...;</i><i> ram ....</i>...</root>
CROSS APPLY x.XmlCol.nodes(N'/root/i') AS y(XmlCol) -- Decomposes the XML into separate sentences (one XML node each)
CROSS APPLY (VALUES(y.XmlCol.value('(text())[1]', 'NVARCHAR(100)'))) AS z(Sentence) -- Reads each sentence as NVARCHAR(100)
CROSS APPLY (VALUES(PATINDEX('%[a-z]%', z.Sentence))) AS t(FirstLetterIndex) -- Finds the index of the first letter
CROSS APPLY (VALUES(IIF(t.FirstLetterIndex > 0, STUFF(z.Sentence, t.FirstLetterIndex, 1, UPPER(SUBSTRING(z.Sentence, t.FirstLetterIndex, 1))), z.Sentence))) AS u(NewSentence) -- Replaces the first letter with its upper-case version, UPPER(...)
ORDER BY ROW_NUMBER() OVER(ORDER BY y.XmlCol) -- Sentences are ordered by their original position within the source string
FOR XML PATH('') -- Concatenates all sentences back into one string
)
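For example, if the source string is N'john is my name; ram is my name; adam is my name'
then the result will be N'John is my name; Ram is my name; Adam is my name'.
Note: this solution (like all the other solutions based on XML shredding) works as long as the source string does not contain XML reserved characters (such as <). Let me know if that is your case.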
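Just a note: my XML shredding has no problem with reserved characters... Searching for the first letter with 'PATINDEX'('%[a-z]%') seems over-complicated... – Shnugo
I believe FOR XML ... adds some overhead. I mentioned that the OP should let me know if the source string contains such characters. Nothing says that the first character of each sentence has to be a letter. It could be a space, or there could be 100 spaces before the first letter. –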
You can create a function like this.
CREATE FUNCTION SPLITTER (
    @textData  NVARCHAR(MAX),
    @Delimeter NVARCHAR(MAX))
RETURNS @RtnValue TABLE (Data NVARCHAR(MAX))
AS
BEGIN
    DECLARE @index INT
    DECLARE @data NVARCHAR(1000)
    DECLARE @firstCharacter NCHAR(1)
    SET @index = CHARINDEX(@Delimeter, @textData)
    WHILE (@index > 0)
    BEGIN
        -- Take the part before the delimiter and trim surrounding spaces
        SET @data = LTRIM(RTRIM(SUBSTRING(@textData, 1, @index - 1)))
        SET @firstCharacter = SUBSTRING(@data, 1, 1)
        -- Keep the part only if its first character equals its upper-case form
        IF UNICODE(@firstCharacter) = UNICODE(UPPER(@firstCharacter))
            INSERT INTO @RtnValue (Data) SELECT @data
        -- Drop the processed part and the delimiter (DATALENGTH / 2 = character count of an NVARCHAR)
        SET @textData = SUBSTRING(@textData, @index + DATALENGTH(@Delimeter) / 2, LEN(@textData))
        SET @index = CHARINDEX(@Delimeter, @textData)
    END
    -- Last part, after the final delimiter; trimmed like the others so a leading space is not mistaken for the first character
    SET @data = LTRIM(RTRIM(@textData))
    SET @firstCharacter = SUBSTRING(@data, 1, 1)
    IF UNICODE(@firstCharacter) = UNICODE(UPPER(@firstCharacter))
        INSERT INTO @RtnValue (Data) SELECT @data
    RETURN
END
Use it like this:
SELECT * FROM SPLITTER('John is my name; Ram is my name; Adam is my name', ';')
You can grab a copy of NGrams8K and do this:
-- note that I made the 3rd item start with lower-case
DECLARE @YourString VARCHAR(100)='John is my name; Ram is my name; adam is my name';
WITH D(n) AS
(
SELECT 0 UNION ALL SELECT position
FROM dbo.NGrams8k(@yourstring,1) WHERE token = ';'
),
TOKEN(token) AS
(
SELECT LTRIM(SUBSTRING(@YourString, N+1,
ISNULL(NULLIF(CHARINDEX(';', @YourString, N+1),0), 101)-(N+1)))
FROM D
)
SELECT token,
FirstLetterIsCaptial = IIF(ASCII(SUBSTRING(token,1,1)) BETWEEN 65 AND 90, 1, 0)
FROM TOKEN;
Results
token              FirstLetterIsCaptial
------------------ --------------------
John is my name    1
Ram is my name     1
adam is my name    0
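Which version of SQL Server? – Shnugo
@Shnugo Microsoft SQL Server 2012 - 11.0.5058.0 (X64) –
This is going to be an ugly problem, especially if the number of semicolon-separated terms is unknown. A better solution is to normalize the data and put each name/sentence in a separate record. –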