2015-11-02 63 views
1

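Find all positions of a char in a string and return them as a comma-separated string

I have strings (VARCHAR(255)) that contain only zeros and ones.
I need to find all positions of '1' and return them as a comma-separated string. I started from the solution at https://dba.stackexchange.com/questions/41961/how-to-find-all-positions-of-a-string-within-another-string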

Here is the code I have so far: two queries built from that solution.

DECLARE @TERM VARCHAR(5); 
SET @TERM = '1'; 
DECLARE @STRING VARCHAR(255); 
SET @STRING = '101011011000000000000000000000000000000000000000'; 

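DECLARE @RESULT VARCHAR(1000); -- widened from VARCHAR(100): positions plus commas for a 255-character string can exceed 100 characters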
SET @RESULT = ''; 

SELECT 
    @RESULT = @RESULT + CAST(X.pos AS VARCHAR(10)) + ',' 
FROM 
    (SELECT 
     pos = Number - LEN(@TERM) 
    FROM 
     (SELECT 
     Number 
     ,Item = LTRIM(RTRIM(SUBSTRING(@STRING, Number, CHARINDEX(@TERM, @STRING + @TERM, Number) - Number))) 
     FROM 
     (SELECT ROW_NUMBER() OVER (ORDER BY [object_id]) FROM sys.all_objects 
     ) AS n (Number) 
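     WHERE 
     Number > 1 
     -- upper bound includes LEN(@TERM): the match is tested against @TERM + @STRING, so stopping at LEN(@STRING) would never test the last character
     AND Number <= CONVERT(INT, LEN(@STRING)) + LEN(@TERM) 
     AND SUBSTRING(@TERM + @STRING, Number, LEN(@TERM)) = @TERM 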
    ) AS y 
    ) X; 

SELECT 
    SUBSTRING(@RESULT, 0, LEN(@RESULT)); 



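-- Second approach: walk the string with PATINDEX.
DECLARE @POS INT; 
DECLARE @OLD_POS INT; 
DECLARE @POSITIONS VARCHAR(1000); -- positions plus commas for a 255-character string can exceed 100 characters
SET @POSITIONS = ''; 
SET @OLD_POS = 0; 
SET @POS = PATINDEX('%1%', @STRING); 
WHILE @POS > 0 
    AND @OLD_POS <> @POS 
    BEGIN 
        -- VARCHAR(3): casting a position of 100 or more to VARCHAR(2) yields '*'
        SET @POSITIONS = @POSITIONS + CAST(@POS AS VARCHAR(3)) + ','; 
        SET @OLD_POS = @POS; 
        -- PATINDEX returns 0 when no further '1' exists, which leaves @POS unchanged and ends the loop
        SET @POS = PATINDEX('%1%', SUBSTRING(@STRING, @POS + 1, LEN(@STRING))) + @POS; 
    END; 
-- LEFT with a negative length raises an error, so handle the empty case first
SELECT 
    CASE WHEN @POSITIONS = '' THEN 'NONE' 
         ELSE LEFT(@POSITIONS, LEN(@POSITIONS) - 1) END; 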

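I would like to know whether this can be done faster or better. I only search for a single character, and only two characters (0 and 1) can appear in my strings.

I built two functions from this code and ran them over 1000 records. Both returned the same results in about the same time, so I cannot tell which one is better.

For a single record, Profiler shows CPU = 0 and reads = 0 for the second snippet, while the first snippet gives CPU = 16 and reads = 17.

I need results like these: 1,3,5,6,8,9 (multiple occurrences), 3 (a single occurrence), NONE (no occurrences).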

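Does this _have_ to be done in SQL? What is the purpose of identifying all the '1' positions? –

Ideally, change the required output and possibly the storage design. If you need to handle multiple values, stuffing comma-separated values into a string should be the *last* resort. SQL Server has types for holding multiple values, such as *tables* and XML. –

@DStanley I have a table with options; it has 100 columns. It is a very old database. Each row should have only a single value set, e.g. a single '1' per row, but older rows contain errors. I am building a report that finds those rows and lists them. I have converted the values to a 0/1 string, and now I need the positions, which correspond to option numbers. – Misiu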

Answers

3

A tally table and XML solution:

DECLARE @STRING NVARCHAR(100) = '101011011000000000000000000000000000000000000000'; 

;with cte as(select ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) p 
      from (values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t1(n) cross join 
        (values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t2(n) cross join 
        (values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t3(n)) 
SELECT STUFF((SELECT ',' + CAST(p AS VARCHAR(100)) 
       FROM cte 
       WHERE p <= LEN(@STRING) AND SUBSTRING(@STRING, p, 1) = '1' 
       FOR XML PATH('')), 1, 1, '') 

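This generates the numbers from 1 to 1000 (add more cross joins if your strings are longer) and filters them with the SUBSTRING function to keep only the positions that hold a '1'. The remaining rows are then concatenated into a comma-separated value with the standard FOR XML PATH trick.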
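
The question also asks for NONE when the string contains no '1'. A minimal sketch of one way to get that, assuming SQL Server 2008 or later (for the VALUES constructor): when no row qualifies, the FOR XML PATH subquery returns NULL, STUFF passes the NULL through, and COALESCE can substitute the literal. The ORDER BY p is an addition here, meant to make the order of the positions explicit.

```sql
DECLARE @STRING VARCHAR(255);
SET @STRING = '000000';  -- sample input without any '1'

WITH cte AS (
    -- 10 x 10 x 10 rows give the numbers 1..1000
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS p
    FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS t1(n)
    CROSS JOIN (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS t2(n)
    CROSS JOIN (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS t3(n)
)
SELECT COALESCE(
           STUFF((SELECT ',' + CAST(p AS VARCHAR(10))
                  FROM cte
                  WHERE p <= LEN(@STRING)
                    AND SUBSTRING(@STRING, p, 1) = '1'
                  ORDER BY p
                  FOR XML PATH('')), 1, 1, ''),
           'NONE') AS positions;  -- NULL (no matching rows) becomes 'NONE'
```

With a string that does contain ones, the same query returns the usual comma-separated list.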

For older versions:

;with cte as(SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) p 
      FROM sys.all_columns a CROSS JOIN sys.all_columns b) 
SELECT STUFF((SELECT ',' + CAST(p AS VARCHAR(100)) 
       FROM cte 
       WHERE p <= LEN(@STRING) AND SUBSTRING(@STRING, p, 1) = '1' 
       FOR XML PATH('')), 1, 1, '') 

Here is a good article on generating ranges: http://dwaincsql.com/2014/03/27/tally-tables-in-t-sql/
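
Since the report has to process many rows rather than one variable, the same technique can probably be applied per row with CROSS APPLY. The following is only a sketch: the table variable @Options and its columns Id and Flags are hypothetical stand-ins for the real table, and the syntax is kept to what SQL Server 2005 accepts (no VALUES constructor, no inline DECLARE initialisers).

```sql
-- Hypothetical sample data; the real table and column names will differ
DECLARE @Options TABLE (Id INT PRIMARY KEY, Flags VARCHAR(255));
INSERT INTO @Options (Id, Flags) VALUES (1, '1010110110');
INSERT INTO @Options (Id, Flags) VALUES (2, '0010000000');
INSERT INTO @Options (Id, Flags) VALUES (3, '0000000000');

WITH digits AS (
    SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
    UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
),
tally AS (
    -- numbers 1..1000 built from three decimal digits
    SELECT h.d * 100 + t.d * 10 + o.d + 1 AS p
    FROM digits h CROSS JOIN digits t CROSS JOIN digits o
)
SELECT o.Id,
       COALESCE(STUFF(x.csv, 1, 1, ''), 'NONE') AS positions
FROM @Options o
CROSS APPLY (
    -- one comma-prefixed list per row; NULL when the row has no '1'
    SELECT (SELECT ',' + CAST(t.p AS VARCHAR(10))
            FROM tally t
            WHERE t.p <= LEN(o.Flags)
              AND SUBSTRING(o.Flags, t.p, 1) = '1'
            ORDER BY t.p
            FOR XML PATH('')) AS csv
) x
ORDER BY o.Id;
-- Id 1: 1,3,5,6,8,9    Id 2: 3    Id 3: NONE
```

Whether this beats calling a scalar function per row depends on the data and the server version, so it is worth measuring both.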

Edit (a number generator that does not use the VALUES constructor):

--6 cross-joined sets of 5 rows generate 15,625 numbers 
;with cte as(SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) p 
       FROM (SELECT 1 AS rn UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1) t1 CROSS JOIN 
       (SELECT 1 AS rn UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1) t2 CROSS JOIN 
       (SELECT 1 AS rn UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1) t3 CROSS JOIN 
       (SELECT 1 AS rn UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1) t4 CROSS JOIN 
       (SELECT 1 AS rn UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1) t5 CROSS JOIN 
       (SELECT 1 AS rn UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1) t6) 
--the final SELECT is the same as in the queries above 
SELECT STUFF((SELECT ',' + CAST(p AS VARCHAR(100)) 
       FROM cte 
       WHERE p <= LEN(@STRING) AND SUBSTRING(@STRING, p, 1) = '1' 
       FOR XML PATH('')), 1, 1, '') 
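
Thank you for such a quick reply, but I cannot get it to work on SQL Server 2005. I get the error: 'Incorrect syntax near the keyword 'values'.' – Misiu

@Misiu, I have edited the answer –

Now it works fine, thanks. Could you describe in a few words why this is better or faster than my two solutions? All three work, but I cannot tell which one is better. – Misiu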

1

Giorgi's answer is clever, but I prefer a more old-fashioned approach that I find more readable. My suggestion, including test cases:

if object_id('UFN_CSVPOSITIONS') is not null 
begin 
    drop function ufn_csvpositions; 
end 
go 

create function dbo.UFN_CSVPOSITIONS 
(
    @string nvarchar(255) 
   ,@delimiter nvarchar(1) = ',' 
) 
returns nvarchar(1000) --positions plus delimiters for 255 ones can exceed 255 characters 
as 
begin 
    --given a string that contains ones, 
    --return a comma-delimited list of the positions of those ones 
    --example: '1001' returns '1,4' 
    declare @result nvarchar(1000) = ''; 
    declare @i int = 1; 
    declare @slen int = len(@string); 
    declare @idx int = 0; 

    --use <= so the final character is checked (with < a string such as '11' returns only '1') 
    while @i <= @slen 
    begin 
        set @idx = charindex('1',@string,@i); 
        if 0 = @idx 
        begin 
            set @i = @slen; --no more to be found, break out early 
        end 
        else 
        begin 
            set @result = @result + @delimiter + convert(nvarchar(3),@idx); 
            set @i = @idx; --jump ahead 
        end; 
        set @i = @i + 1; 
    end --while 

    --strip the leading delimiter (compare against @delimiter, not a hard-coded comma) 
    if (0 < len(@result)) and (@delimiter = substring(@result,1,1)) 
    begin 
        set @result = substring(@result,2,len(@result)-1) 
    end 

    return @result; 
end 
go 

--test cases 
DECLARE @STRING NVARCHAR(255) = ''; 
set @string = '101011011000000000000000000000000000000000000000'; 
print dbo.UFN_CSVPOSITIONS(@string,','); 
set @string = null; 
print dbo.UFN_CSVPOSITIONS(@string,','); 
set @string = ''; 
print dbo.UFN_CSVPOSITIONS(@string,','); 
set @string = '1111111111111111111111111111111111111111111111111'; 
print dbo.UFN_CSVPOSITIONS(@string,','); 
set @string = '0000000000000000000000000000000000000000000000000'; 
print dbo.UFN_CSVPOSITIONS(@string,','); 

--lets try a very large # of test cases, see how fast it comes out 
--255 "ones" should be the worst case scenario for performance, so lets run through 50k of those. 
--on my laptop, here are test case results: 
--all 1s : 13 seconds 
--all 0s : 7 seconds 
--all nulls: 1 second 
declare @testinput nvarchar(255) = replicate('1',255); 
declare @iterations int = 50000; 
declare @i int = 0; 
while @i < @iterations 
begin 
    print dbo.ufn_csvpositions(@testinput,','); 
    set @i = @i + 1; 
end; 

--repeat the test using the CTE method. 
--the same test cases are as follows on my local: 
--all 1s : 18 seconds 
--all 0s : 15 seconds 
--all NULLs: 1 second 
set nocount on; 
set @i = 0; 
set @iterations = 50000; 
declare @result nvarchar(255) = ''; 
set @testinput = replicate('1',255); 
while @i < @iterations 
begin 
    ;with cte as(SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) p 
       FROM sys.all_columns a CROSS JOIN sys.all_columns b) 
    SELECT @result = STUFF((SELECT ',' + CAST(p AS VARCHAR(100)) 
       FROM cte 
       WHERE p <= LEN(@testinput) AND SUBSTRING(@testinput, p, 1) = '1' 
       FOR XML PATH('')), 1, 1, '') 
    print @result; 
    set @i = @i + 1; 
end; 
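
Both loops above PRINT every result, and sending 50,000 messages to the client may add overhead that has little to do with the methods being compared. A rough sketch of a harness that assigns the result to a variable and reports elapsed milliseconds instead; it assumes the dbo.UFN_CSVPOSITIONS function from this answer already exists and that it is run as its own batch. The variable names @sink and @start are made up for the illustration.

```sql
SET NOCOUNT ON;

DECLARE @testinput NVARCHAR(255);
DECLARE @iterations INT;
DECLARE @i INT;
DECLARE @sink NVARCHAR(1000);   -- receives each result so that nothing is printed
DECLARE @start DATETIME;

SET @testinput = REPLICATE('1', 255);  -- worst case: every position matches
SET @iterations = 50000;
SET @i = 0;
SET @start = GETDATE();

WHILE @i < @iterations
BEGIN
    -- assumes the function defined earlier in this answer
    SET @sink = dbo.UFN_CSVPOSITIONS(@testinput, ',');
    SET @i = @i + 1;
END;

SELECT DATEDIFF(ms, @start, GETDATE()) AS elapsed_ms;
```

Swapping the function call for the CTE query gives a like-for-like number for the other method; the absolute figures will differ from machine to machine.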
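
I already have a solution that uses a WHILE loop, but thank you for your answer and the time you spent. I am looking for the fastest way to do this. Your code is easier to read, but I think @giorgi-nakeuri's code is much faster. Sorry if I am wrong; I am a novice when it comes to SQL query optimization. – Misiu

When I compared the performance of the two approaches, the WHILE loop was faster. I have edited my answer to include test cases for both approaches; try them on your machine and see whether your experience is the same. – JosephStyons

I can confirm that the WHILE version is somewhat faster. I tested on 140k rows, and both CPU and reads are lower for the WHILE approach. I still have to test on a larger data set. – Misiu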