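I'm trying to write a UDF (actually, I'm consolidating some code I found on the web into a single function) that does what the title describes: a SQL user-defined function that strips HTML tags and replaces HTML entities.
Here is the code: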
-- Working variables: the text being cleaned and the position/length of the current tag
declare @txt varchar(max), @start int, @end int, @len int
set @txt = '<p class="answer">Informamos que a documentação <strong>deve ser impressa e enviada fisicamente pela AG&Ecirc;NCIA</strong>, contendo confere com oringinal por funcionário CAIXA.</p>'
-- Locate the first tag: from '<' to the next '>'
set @start = charindex('<',@txt)
set @end = charindex('>',@txt,@start)
set @len = (@end - @start) + 1
-- Remove tags one at a time until no complete '<...>' pair is left
while @start > 0 and @end > 0 and @len > 0
begin
set @txt = stuff(@txt,@start,@len,'')
set @start = charindex('<',@txt)
set @end = charindex('>',@txt,@start)
set @len = (@end - @start) + 1
end
-- Replace named HTML entities with the corresponding characters
SET @txt = REPLACE(@txt,'&nbsp;',' ') --space
SET @txt = REPLACE(@txt,'&ldquo;',CHAR(34)) --"
SET @txt = REPLACE(@txt,'&rdquo;',CHAR(34)) --"
SET @txt = REPLACE(@txt,'&lsquo;',CHAR(39)) --'
SET @txt = REPLACE(@txt,'&rsquo;',CHAR(39)) --'
SET @txt = REPLACE(@txt,'&ndash;',CHAR(150)) -- –
SET @txt = REPLACE(@txt,'&mdash;',CHAR(151)) -- —
SET @txt = REPLACE(@txt,'&ordm;',CHAR(186)) -- º
SET @txt = REPLACE(@txt,'&ordf;',CHAR(170)) -- ª
SET @txt = REPLACE(@txt,'&sect;',CHAR(167)) -- §
--------------------------------------------------------------
SET @txt = REPLACE(@txt,'&quot;',CHAR(34)) --"
SET @txt = REPLACE(@txt,'&apos;',CHAR(39)) --'
--------------------------------------------------------------
SET @txt = REPLACE(@txt,'&agrave;','à') --à
SET @txt = REPLACE(@txt,'&aacute;','á') --á
SET @txt = REPLACE(@txt,'&atilde;','ã') --ã
SET @txt = REPLACE(@txt,'&acirc;','â') --â
SET @txt = REPLACE(@txt,'&auml;','ä') --ä
SET @txt = REPLACE(@txt,'&eacute;','é') --é
SET @txt = REPLACE(@txt,'&ecirc;','ê') --ê
SET @txt = REPLACE(@txt,'&iacute;','í') --í
SET @txt = REPLACE(@txt,'&oacute;','ó') --ó
SET @txt = REPLACE(@txt,'&otilde;','õ') --õ
SET @txt = REPLACE(@txt,'&oslash;','ø') --ø
SET @txt = REPLACE(@txt,'&uacute;','ú') --ú
SET @txt = REPLACE(@txt,'&uuml;','ü') --ü
SET @txt = REPLACE(@txt,'&ccedil;','ç') --ç
--------------------------------------------------------------
SET @txt = REPLACE(@txt,'&Agrave;',CHAR(192)) --À
SET @txt = REPLACE(@txt,'&Aacute;',CHAR(193)) --Á
SET @txt = REPLACE(@txt,'&Atilde;',CHAR(195)) --Ã
SET @txt = REPLACE(@txt,'&Acirc;',CHAR(194)) --Â
SET @txt = REPLACE(@txt,'&Auml;',CHAR(196)) --Ä
SET @txt = REPLACE(@txt,'&Eacute;',CHAR(201)) --É
SET @txt = REPLACE(@txt,'&Ecirc;',CHAR(202)) --Ê
SET @txt = REPLACE(@txt,'&Iacute;',CHAR(205)) --Í
SET @txt = REPLACE(@txt,'&Oacute;',CHAR(211)) --Ó
SET @txt = REPLACE(@txt,'&Otilde;',CHAR(213)) --Õ
SET @txt = REPLACE(@txt,'&Oslash;',CHAR(216)) --Ø
SET @txt = REPLACE(@txt,'&Uacute;',CHAR(218)) --Ú
SET @txt = REPLACE(@txt,'&Uuml;',CHAR(220)) --Ü
SET @txt = REPLACE(@txt,'&Ccedil;',CHAR(199)) --Ç
select LTRIM(RTRIM(@txt))
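It strips the HTML tags, but it only converts lowercase HTML entities correctly. When it finds an uppercase entity such as &Ecirc;, as in the word AG&Ecirc;NCIA (AGÊNCIA), it doesn't work: it prints AGêNCIA instead.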
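To isolate the problem, here is a smaller reproduction. It is only a sketch and rests on two assumptions: that the database default collation is case-insensitive (which would explain the behavior described above), and that the server code page can store 'ê' in a varchar. The collation name Latin1_General_CS_AS in the second query is an illustrative choice and is not part of my original code.

-- Minimal reproduction (assumes a case-insensitive default collation)
declare @sample varchar(max)
set @sample = 'AG&Ecirc;NCIA'
-- REPLACE compares using the collation of its input, so under a
-- case-insensitive collation '&ecirc;' also matches '&Ecirc;'
select REPLACE(@sample,'&ecirc;','ê') as default_collation
-- Same call with a case-sensitive collation forced on the input expression only
-- (Latin1_General_CS_AS is an assumed example); here '&Ecirc;' should be left untouched
select REPLACE(@sample COLLATE Latin1_General_CS_AS,'&ecirc;','ê') as case_sensitive

If the two results differ, the lowercase replacement is consuming the uppercase entity before the uppercase REPLACE lines ever run.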
Any help getting this to work correctly?
Edit: P.S. I can't change my database collation, as suggested by @dzomba.
Try this: http://stackoverflow.com/questions/457701/best-way-to-strip-html-tags-from-a-string-in-sql-server – Sushil