2017-08-26 77 views
1

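SQL query: join on comma-separated values

I have a table with a composite key and a column of comma-separated values. I need to split each row into one row per comma-separated element. I have seen similar questions with similar answers, but I have not been able to turn them into a solution of my own.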

I am running SQL Server 2008 R2.

| Key Part 1 | Key Part 2 | Key Part 3 | Values      |
|------------|------------|------------|-------------|
| A          | A          | A          | PDE,PPP,POR |
| A          | A          | B          | PDE,XYZ     |
| A          | B          | A          | PDE,RRR     |

I need this as the output:

| Key Part 1 | Key Part 2 | Key Part 3 | Values | Sequence |
|------------|------------|------------|--------|----------|
| A          | A          | A          | PDE    | 0        |
| A          | A          | A          | PPP    | 1        |
| A          | A          | A          | POR    | 2        |
| A          | A          | B          | PDE    | 0        |
| A          | A          | B          | XYZ    | 1        |
| A          | B          | A          | PDE    | 0        |
| A          | B          | A          | RRR    | 1        |

Thanks,

Jeff

+0

Are all of the elements exactly 3 characters each, as they are in your example?

+1

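You should **not** store multiple values in a single cell as a comma-separated list. First of all, as you can see, working with that data later is messy and hard work, and it also violates the **first normal form** of relational database design.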
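To make the comment above concrete, here is a minimal sketch of a normalized layout. The table name dbo.YourTableValue, the column names, and the column types are all assumptions for illustration; none of them come from the question. Each element gets its own row, so nothing has to be split at query time.

```sql
-- Hypothetical child table: one row per element instead of one CSV string.
-- Name, columns, and types are assumptions, not taken from the question.
CREATE TABLE dbo.YourTableValue (
    KeyPart1   char(1)     NOT NULL,
    KeyPart2   char(1)     NOT NULL,
    KeyPart3   char(1)     NOT NULL,
    [Sequence] int         NOT NULL,  -- position of the element within its group
    [Value]    varchar(50) NOT NULL,
    CONSTRAINT PK_YourTableValue
        PRIMARY KEY (KeyPart1, KeyPart2, KeyPart3, [Sequence])
);

-- The first row of the question's sample data, stored one element per row.
INSERT dbo.YourTableValue (KeyPart1, KeyPart2, KeyPart3, [Sequence], [Value])
VALUES ('A', 'A', 'A', 0, 'PDE'),
       ('A', 'A', 'A', 1, 'PPP'),
       ('A', 'A', 'A', 2, 'POR');
```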

Answers

4

Here is a simple inline approach, if you don't have (or don't want) a split/parse UDF.

Example

Select A.[Key Part 1]
      ,A.[Key Part 2]
      ,A.[Key Part 3]
      ,B.*
From  YourTable A
Cross Apply (
                -- One output row per <x> node; Sequence is zero-based
                Select [Values]   = LTrim(RTrim(X2.i.value('(./text())[1]', 'varchar(max)')))
                      ,[Sequence] = Row_Number() over (Order By (Select null))-1
                -- Turn 'a,b,c' into '<x>a</x><x>b</x><x>c</x>' and cast it to xml
                From  (Select x = Cast('<x>' + replace(A.[Values],',','</x><x>')+'</x>' as xml)) X1
                Cross Apply x.nodes('x') X2(i)
            ) B

Returns

[image: screenshot of the query result]
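One caveat with the XML cast: it raises an XML parsing error if a value contains characters such as & or <. The following is a minimal sketch of one way around that, escaping those characters before the cast. The table variable and its sample row are assumptions made up for illustration.

```sql
-- Sample data is hypothetical; the second element contains '<' and the first '&'.
DECLARE @YourTable TABLE (
    [Key Part 1] char(1), [Key Part 2] char(1), [Key Part 3] char(1),
    [Values] varchar(100)
);
INSERT @YourTable VALUES ('A', 'A', 'A', 'R&D,P<Q');

SELECT A.[Key Part 1], A.[Key Part 2], A.[Key Part 3], B.*
FROM   @YourTable A
CROSS APPLY (
    SELECT [Values]   = LTRIM(RTRIM(X2.i.value('(./text())[1]', 'varchar(max)')))
          ,[Sequence] = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1
    FROM  (
            -- Escape '&' first, then '<' and '>', and only then
            -- replace the commas with the element tags.
            SELECT x = CAST('<x>'
                     + REPLACE(REPLACE(REPLACE(REPLACE(A.[Values],
                           '&', '&amp;'), '<', '&lt;'), '>', '&gt;'),
                           ',', '</x><x>')
                     + '</x>' AS xml)
          ) X1
    CROSS APPLY x.nodes('x') X2(i)
) B;
```

The value() method hands back the unescaped text, so the split elements come out as R&D and P<Q.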

EDIT - If you are open to a table-valued function

The query would look like this

Select A.[Key Part 1]
      ,A.[Key Part 2]
      ,A.[Key Part 3]
      ,[Values]   = B.RetVal
      ,[Sequence] = B.RetSeq-1
From  YourTable A
Cross Apply [dbo].[udf-Str-Parse-8K](A.[Values],',') B

The UDF, if interested

CREATE FUNCTION [dbo].[udf-Str-Parse-8K] (@String varchar(max),@Delimiter varchar(25)) 
Returns Table 
As 
Return ( 
    with cte1(N) As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)), 
      cte2(N) As (Select Top (IsNull(DataLength(@String),0)) Row_Number() over (Order By (Select NULL)) From (Select N=1 From cte1 a,cte1 b,cte1 c,cte1 d) A), 
      cte3(N) As (Select 1 Union All Select t.N+DataLength(@Delimiter) From cte2 t Where Substring(@String,t.N,DataLength(@Delimiter)) = @Delimiter), 
      cte4(N,L) As (Select S.N,IsNull(NullIf(CharIndex(@Delimiter,@String,s.N),0)-S.N,8000) From cte3 S) 

    Select RetSeq = Row_Number() over (Order By A.N) 
      ,RetVal = LTrim(RTrim(Substring(@String, A.N, A.L))) 
    From cte4 A 
); 
--Original Source http://www.sqlservercentral.com/articles/Tally+Table/72993/
--Select * from [dbo].[udf-Str-Parse-8K]('Dog,Cat,House,Car',',') 
--Select * from [dbo].[udf-Str-Parse-8K]('John||Cappelletti||was||here','||') 
0

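If all of the CSV values are exactly 3 characters (as they are in your test data), you can do this very efficiently by using a tally table to generate only the predetermined number of rows needed (as opposed to generating a row for every character in order to find the delimiters)... because you already know where the delimiters are.

In this case I'll use a tally function, but you could also use a fixed tally table.

Code for the tfn_Tally function...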
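For the fixed tally table alternative, a minimal sketch could look like the following. The table name dbo.Tally and the 10,000-row size are assumptions for illustration, not part of the answer.

```sql
-- Hypothetical persisted tally table holding 0 .. 9,999.
CREATE TABLE dbo.Tally (n int NOT NULL PRIMARY KEY CLUSTERED);

WITH
    n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) x (n)), -- 10 rows
    n2 (n) AS (SELECT 1 FROM n1 a CROSS JOIN n1 b),                                   -- 100 rows
    n3 (n) AS (SELECT 1 FROM n2 a CROSS JOIN n2 b)                                    -- 10,000 rows
INSERT dbo.Tally (n)
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1
FROM   n3;

-- Usage with fixed-width, 3-character elements:
-- join on n <= number of commas, which yields one row per element.
SELECT d.Vals,
       SplitValue = SUBSTRING(d.Vals, t.n * 4 + 1, 3),
       [Sequence] = t.n
FROM   (VALUES ('PDE,PPP,POR'), ('PDE,XYZ')) d (Vals)
JOIN   dbo.Tally t
  ON   t.n <= LEN(d.Vals) - LEN(REPLACE(d.Vals, ',', ''));
```

A persisted table avoids regenerating the numbers on every call, at the cost of one more object to maintain and a fixed upper bound on the row count.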

SET QUOTED_IDENTIFIER ON 
SET ANSI_NULLS ON 
GO 
CREATE FUNCTION dbo.tfn_Tally 
/* ============================================================================ 
07/20/2017 JL, Created. Capable of creating a sequence of rows
       ranging from -10,000,000,000,000,000 to 10,000,000,000,000,000 
============================================================================ */ 
(
    @NumOfRows BIGINT, 
    @StartWith BIGINT 
) 
RETURNS TABLE WITH SCHEMABINDING AS 
RETURN 
    WITH 
     cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)), -- 10 rows 
     cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),        -- 100 rows 
     cte_n3 (n) AS (SELECT 1 FROM cte_n2 a CROSS JOIN cte_n2 b),        -- 10,000 rows 
     cte_n4 (n) AS (SELECT 1 FROM cte_n3 a CROSS JOIN cte_n3 b),        -- 100,000,000 rows 
     cte_Tally (n) AS (
      SELECT TOP (@NumOfRows) 
       (ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1) + @StartWith 
      FROM 
       cte_n4 a CROSS JOIN cte_n4 b             -- 10,000,000,000,000,000 rows 
      ) 
    SELECT 
     t.n 
    FROM 
     cte_Tally t; 
GO 

How to use it in the solution...

-- create some test data... 
IF OBJECT_ID('tempdb..#TestData', 'U') IS NOT NULL 
DROP TABLE #TestData; 

CREATE TABLE #TestData (
    KeyPart1 CHAR(1), 
    KeyPart2 CHAR(1), 
    KeyPart3 CHAR(1), 
    [Values] varchar(50) 
    ); 

INSERT #TestData (KeyPart1, KeyPart2, KeyPart3, [Values]) VALUES 
    ('A', 'A', 'A', 'PDE,PPP,POR'), 
    ('A', 'A', 'B', 'PDE,XYZ'), 
    ('A', 'B', 'A', 'PDE,RRR,XXX,YYY,ZZZ,AAA,BBB,CCC'); 

--========================================================== 

-- solution query... 
SELECT 
    td.KeyPart1, 
    td.KeyPart2, 
    td.KeyPart3, 
    x.SplitValue, 
    [Sequence] = t.n 
FROM 
    #TestData td 
    CROSS APPLY dbo.tfn_Tally(LEN(td.[Values]) - LEN(REPLACE(td.[Values], ',', '')) + 1, 0) t 
    CROSS APPLY (VALUES (SUBSTRING(td.[Values], t.n * 4 + 1, 3))) x (SplitValue); 

And the results...

KeyPart1 KeyPart2 KeyPart3 SplitValue Sequence
-------- -------- -------- ---------- --------------------
A        A        A        PDE        0
A        A        A        PPP        1
A        A        A        POR        2
A        A        B        PDE        0
A        A        B        XYZ        1
A        B        A        PDE        0
A        B        A        RRR        1
A        B        A        XXX        2
A        B        A        YYY        3
A        B        A        ZZZ        4
A        B        A        AAA        5
A        B        A        BBB        6
A        B        A        CCC        7
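The arithmetic behind the solution query can be checked on a single string. This is only a worked illustration using the first sample value; the variable name @s is arbitrary.

```sql
DECLARE @s varchar(50) = 'PDE,PPP,POR';

SELECT
    -- Removing the commas shortens the string by the number of commas;
    -- the element count is that number plus one: 11 - 9 + 1 = 3.
    ElementCount = LEN(@s) - LEN(REPLACE(@s, ',', '')) + 1,
    -- Each element occupies 3 characters plus 1 delimiter, so element n
    -- (zero-based) starts at position n * 4 + 1.
    Element0 = SUBSTRING(@s, 0 * 4 + 1, 3),   -- 'PDE'
    Element1 = SUBSTRING(@s, 1 * 4 + 1, 3),   -- 'PPP'
    Element2 = SUBSTRING(@s, 2 * 4 + 1, 3);   -- 'POR'
```

This is why tfn_Tally is asked for exactly ElementCount rows starting at 0: each generated n maps directly to one element's start position.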

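If the assumption that all CSV elements have the same number of characters is not correct, you are better off using a traditional tally-based splitter. In that case, my recommendation is DelimitedSplit8K written by Jeff Moden.

With that function, the solution query would look like this...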

SELECT 
    td.KeyPart1, 
    td.KeyPart2, 
    td.KeyPart3, 
    SplitValue = dsk.Item, 
    [Sequence] = dsk.ItemNumber - 1 
FROM 
    #TestData td 
    CROSS APPLY dbo.DelimitedSplit8K(td.[Values], ',') dsk; 

And the results...

KeyPart1 KeyPart2 KeyPart3 SplitValue Sequence
-------- -------- -------- ---------- --------------------
A        A        A        PDE        0
A        A        A        PPP        1
A        A        A        POR        2
A        A        B        PDE        0
A        A        B        XYZ        1
A        B        A        PDE        0
A        B        A        RRR        1
A        B        A        XXX        2
A        B        A        YYY        3
A        B        A        ZZZ        4
A        B        A        AAA        5
A        B        A        BBB        6
A        B        A        CCC        7

HTH, Jason

0

-- Create the table

Create table YourTable 
(
p1 varchar(50), 
p2 varchar(50), 
p3 varchar(50), 
pval varchar(50) 
) 
go 

-- Insert the data

insert into YourTable values ('A','A','A','PDE,PPP,POR'), 
('A','A','B','PDE,XYZ'),('A','B','A','PDE,RRR') 

    go 

-- View the sample data

SELECT p1, p2, p3 , pval FROM YourTable 
go 

-- Desired result

SELECT p1, p2, p3,
       LTRIM(RTRIM(Split.a.value('.', 'VARCHAR(100)'))) AS Value1,
       ROW_NUMBER() OVER (PARTITION BY id ORDER BY id ASC) - 1 AS SequenceNo
FROM
(
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ID,
           p1, p2, p3, pval,
           CAST('<M>' + REPLACE(pval, ',', '</M><M>') + '</M>' AS XML) AS Data
    FROM YourTable
) AS A
CROSS APPLY Data.nodes('/M') AS Split(a)
go

-- Drop the temporarily created table

drop table YourTable 
go