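T-SQL substring or string manipulation

I have an nvarchar(max) column, and I need to extract everything between the opening href tag and the closing href tag. For example, if my column contains the following: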

Here you can visit <a href="http://www.thisite.com">this link</a> or this 
<a href="http://www.newsite.com">new link</a>. this is just a test to find the right answer.

Then the result of my query should be:

"<a href="http://www.thisite.com">this link</a>" 
"<a href="http://www.newsite.com">new link</a>"

Any help would be greatly appreciated!

2011-09-30 james

You need to use a CLR user-defined function (supported in SQL Server 2005 and later):

Regular Expressions Make Pattern Matching And Data Extraction Easier

2011-09-30 06:27:10

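I agree with using the CLR, but regular expressions cannot be used to *reliably* parse HTML. For a good read, see [You can't parse HTML with regular expressions](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags)... *The <center> cannot hold*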

*HTML is a language of sufficient complexity that it cannot be parsed by regular expressions. Even Jon Skeet cannot parse HTML using regular expressions.*

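I agree, but the asker does not need to **parse HTML**. Since the problem is simple ("extract everything between the opening href tag and the closing href tag"), the KISS principle should work just fine here.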

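Agreed, a CLR solution should be faster. What's more, I don't think SQL Server should be doing this job at all. You could write a client application (VB.NET, C#, etc.) or a PowerShell script to do it instead.

If you want a T-SQL-only solution (please read the paragraph above once more), take a look at this query (it requires at least SQL Server 2005):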

CREATE TABLE dbo.TestData 
(
    ID INT IDENTITY(1,1) PRIMARY KEY 
    ,SomeText NVARCHAR(MAX) NOT NULL 
); 
INSERT dbo.TestData 
SELECT 'Here you can visit <a href="http://www.thisite.com">this link</a> or this <a href="http://www.newsite.com">new link</a>' 
UNION ALL 
SELECT '<div class="tagged"> 
<a href="https://stackoverflow.com/questions/tagged/string" class="post-tag">string</a>&nbsp; 
    <span class="item-multiplier">&times;&nbsp;16364</span><br> 
<a href="https://stackoverflow.com/questions/tagged/tsql" class="post-tag">tsql</a>&nbsp; 
    <span class="item-multiplier">&times;&nbsp;10304</span><br> 
<a href="https://stackoverflow.com/questions/tagged/substring" class="post-tag">substring</a><acronym title="as soon as possible">ASAP</acronym>'; 

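-- Recursive CTE: the anchor member extracts the first <a ...>...</a> of each row;
-- the recursive member resumes the search just past the previous </a>.
-- Rows with more than about 100 links exceed the default recursion limit;
-- append OPTION (MAXRECURSION 0) to the final SELECT to lift it.
WITH ParseAnchorTags 
AS 
(
SELECT a.ID 
     ,SUBSTRING(a.SomeText, CHARINDEX('<a ',a.SomeText), CHARINDEX('</a>',a.SomeText)-CHARINDEX('<a ',a.SomeText)+4) AS Txt 
     ,CHARINDEX('</a>',a.SomeText)+3 AS LastIndex -- position of the closing '>' of </a>
FROM dbo.TestData a 
WHERE CHARINDEX('<a ',a.SomeText) > 0 -- skip rows without any anchor tag
AND  CHARINDEX('</a>',a.SomeText) > CHARINDEX('<a ',a.SomeText) -- the close must follow the open
UNION ALL 
SELECT a.ID 
     ,SUBSTRING(a.SomeText, CHARINDEX('<a ',a.SomeText,prev.LastIndex+1), CHARINDEX('</a>',a.SomeText,prev.LastIndex+1)-CHARINDEX('<a ',a.SomeText,prev.LastIndex+1)+4) AS Txt 
     ,CHARINDEX('</a>',a.SomeText,prev.LastIndex+1)+3 AS LastIndex 
FROM dbo.TestData a 
INNER JOIN ParseAnchorTags prev ON a.ID=prev.ID 
AND  CHARINDEX('<a ',a.SomeText,prev.LastIndex+1) > 0 
AND  CHARINDEX('</a>',a.SomeText,prev.LastIndex+1) > CHARINDEX('<a ',a.SomeText,prev.LastIndex+1) -- stop at an unclosed tag
) 
SELECT cte.ID, cte.Txt -- LastIndex is used only for ordering
FROM ParseAnchorTags cte 
ORDER BY cte.ID, cte.LastIndex; 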

DROP TABLE dbo.TestData;

Results:

ID   Txt 
----------- -------------------------------------------------------------------- 
1   <a href="http://www.thisite.com">this link</a> 
1   <a href="http://www.newsite.com">new link</a> 
2   <a href="https://stackoverflow.com/questions/tagged/string" class="post-tag">string</a> 
2   <a href="https://stackoverflow.com/questions/tagged/tsql" class="post-tag">tsql</a> 
2   <a href="https://stackoverflow.com/questions/tagged/substring" class="post-tag">substring</a>
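
As a rough sketch of the client-side route recommended at the start of this answer, the same extraction can be done with a real HTML parser instead of string offsets. The example is in Python rather than the VB.NET, C#, or PowerShell mentioned above, purely to keep it short. The names `AnchorExtractor` and `extract_anchors` are made up for illustration, fetching the column value from the database is left out, and anchors are assumed not to be nested.

```python
from html.parser import HTMLParser


class AnchorExtractor(HTMLParser):
    """Collects the source text of every <a ...>...</a> element (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=False keeps entities such as &nbsp; as written.
        super().__init__(convert_charrefs=False)
        self.anchors = []   # finished "<a ...>...</a>" strings
        self._parts = None  # pieces of the anchor being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "a" and self._parts is None:
            # Start a new anchor with its raw opening tag.
            self._parts = [self.get_starttag_text()]
        elif self._parts is not None:
            # Any other tag inside an anchor is copied through unchanged.
            self._parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> inside an anchor.
        if self._parts is not None:
            self._parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self._parts is None:
            return
        self._parts.append(f"</{tag}>")  # tag names arrive lower-cased
        if tag == "a":
            self.anchors.append("".join(self._parts))
            self._parts = None

    def handle_data(self, data):
        if self._parts is not None:
            self._parts.append(data)

    def handle_entityref(self, name):
        if self._parts is not None:
            self._parts.append(f"&{name};")

    def handle_charref(self, name):
        if self._parts is not None:
            self._parts.append(f"&#{name};")


def extract_anchors(text):
    """Return each anchor element found in text, in document order."""
    parser = AnchorExtractor()
    parser.feed(text)
    parser.close()
    return parser.anchors


sample = ('Here you can visit <a href="http://www.thisite.com">this link</a> '
          'or this <a href="http://www.newsite.com">new link</a>.')
for anchor in extract_anchors(sample):
    print(anchor)
# <a href="http://www.thisite.com">this link</a>
# <a href="http://www.newsite.com">new link</a>
```

Unlike the CHARINDEX approach, this does not depend on the exact spelling of `<a ` and simply returns an empty list when the text contains no links.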

2011-09-30 07:19:51

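-- Sample input, typed nvarchar(max) to match the column in the question.
-- The DECLARE ... = initializer needs SQL Server 2008 or later.
declare @a nvarchar(max) = N'Here you can visit <a href="http://www.thisite.com">this link</a> or this <a href="http://www.newsite.com">new link</a>. this is just a test to find the right answer. ' 

-- f = start of the current '<a href=' tag, t = start of its '</a>'.
-- The seed row (1, 1) only starts the search and is filtered out at the end.
;with cte as 
(
select cast(1 as bigint) f, cast(1 as bigint) t 
union all 
select charindex('<a href=', @a, t), charindex('</a>', @a, charindex('<a href=', @a, t)) 
from cte where charindex('<a href=', @a, t) > 0 
and charindex('</a>', @a, charindex('<a href=', @a, t)) > 0 -- stop at an unclosed tag
) 
select substring(@a, f, t-f)+'</a>' from cte 
where t > 1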

2011-09-30 07:27:56
