2013-12-11 87 views
1

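Parsing data from one row into multiple rows in a CSV file

I'm new to SSIS and need some help figuring out how to parse this data. The Course-level Learning Objectives need to be split into multiple rows, and the data in [] needs to be moved into its own column. Any help would be greatly appreciated. The CSV file contains multiple records; the example below is just one record.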

Current CSV file format:

Prefix/Code,Name,Credits,Description,Course-level Learning Objectives 

ABE 095,Keys to Academic Success,3.0 ,"Basic .. assessment. "," 

Identify learn. [EXPLORE] 
Evaluate personal, goals. [ACT] 
Utilize development. [EXPLORE] 
" 

Format the file needs to be in:

Prefix/Code,Name,Credits,Description,Course-level Learning Objectives,Type 

ABE 095,Keys to Academic Success,3.0 ,"Basic .. assessment.","Identify learn.", [EXPLORE] 
ABE 095,Keys to Academic Success,3.0 ,"Basic .. assessment.","Evaluate goals.", [ACT] 
ABE 095,Keys to Academic Success,3.0 ,"Basic .. assessment.","Utilize dev.", [EXPLORE] 
+0

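Thank you for formatting my post. I have spent hours researching how to do this and it has been stressing me out. billinkc: I noticed that a lot of the posts I came across were answered by you. Could you help me with this problem? – Josh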

+0

Can you store the data, at least temporarily, in a database table? For example in SQL Server? –

+0

Yes, the data can be inserted into a SQL Server table. – Josh

Answers

0

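Basic steps:

  1. Use SSIS to load the data into a SQL Server table
  2. Slay the "Course-level Learning Objectives" dragon, i.e. split that column into its individual objective lines
  3. Unpivot the results

Here is a code snippet for an "Nth index" function: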

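CREATE FUNCTION [dbo].[udf_NthIndex]
        (@Input  VARCHAR(8000),
        @Delimiter CHAR(1),
        @Ordinal INT)

    RETURNS INT
    AS

     BEGIN
     -- Returns the 1-based position of the @Ordinal-th occurrence of
     -- @Delimiter in @Input, or 0 if there are fewer occurrences.
     DECLARE @Pointer INT,
       @Last INT,
       @Count INT

     SET @Pointer = 1   -- position where the next search starts
     SET @Last = 0      -- result; stays 0 if the occurrence is not found
     SET @Count = 1     -- which occurrence is currently being looked for

     WHILE (1 = 1)
      BEGIN
      SET @Pointer = CHARINDEX(@Delimiter,@Input,@Pointer)
      IF @Pointer = 0
       BREAK
      IF @Count = @Ordinal
       BEGIN
       SET @Last = @Pointer
       BREAK
       END
      SET @Count = @Count + 1
      SET @Pointer = @Pointer + 1
      END

     RETURN @Last

     END

    GO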
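For anyone who wants to experiment with the lookup logic outside SQL Server, here is a minimal sketch of the same idea in Python. It is not part of the answer's T-SQL; the name `nth_index` is made up for illustration, and it simply mirrors the contract of `udf_NthIndex` (1-based position of the Nth delimiter, 0 when there are fewer than N).

```python
def nth_index(text: str, delimiter: str, ordinal: int) -> int:
    """Return the 1-based position of the ordinal-th delimiter, or 0 if absent.

    Hypothetical helper that mirrors dbo.udf_NthIndex for experimentation.
    """
    search_from = 0  # 0-based index where the next search starts
    for _ in range(ordinal):
        found = text.find(delimiter, search_from)
        if found == -1:
            return 0  # fewer than `ordinal` occurrences
        search_from = found + 1  # continue just past this occurrence
    # After the loop, search_from == (0-based index of the hit) + 1,
    # which is exactly the 1-based position. For ordinal <= 0 it is still 0.
    return search_from
```

Calling `nth_index("a,b,c", ",", 2)` looks for the second comma, and asking for a third comma in the same string falls through to the "not found" result.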

This approach solves the problem with Common Table Expressions, the "Nth index" function, and UNPIVOT:

WITH s1 
    AS (SELECT 'ABE 095' AS [Prefix/Code] 
       , 'Keys to Academic Success' AS Name 
       , '3.0' AS Credits 
       , 'Basic .. assessment. ' AS Description 
       , ' 

Identify learn. [EXPLORE] 
Evaluate personal, goals. [ACT] 
Utilize development. [EXPLORE] 
' AS [Course-level Learning Objectives] 
     ) , s2
    AS (-- Locate each objective line by the position of the carriage return before it.
        -- The objectives column is carried along so that s3 can read from s2 alone
        -- (a cross join of s1 and s2 would multiply rows once there is more than one record).
        SELECT [Prefix/Code]
       , Name
       , Credits
       , Description
       , [Course-level Learning Objectives]
       , dbo.udf_NthIndex([Course-level Learning Objectives], CHAR(13), 2) + 2 AS Type1Start
       , dbo.udf_NthIndex([Course-level Learning Objectives], CHAR(13), 3)
         - dbo.udf_NthIndex([Course-level Learning Objectives], CHAR(13), 2) AS Type1Length
       , dbo.udf_NthIndex([Course-level Learning Objectives], CHAR(13), 3) + 2 AS Type2Start
       , dbo.udf_NthIndex([Course-level Learning Objectives], CHAR(13), 4)
         - dbo.udf_NthIndex([Course-level Learning Objectives], CHAR(13), 3) AS Type2Length
       , dbo.udf_NthIndex([Course-level Learning Objectives], CHAR(13), 4) + 2 AS Type3Start
       , dbo.udf_NthIndex([Course-level Learning Objectives], CHAR(13), 5)
         - dbo.udf_NthIndex([Course-level Learning Objectives], CHAR(13), 4) AS Type3Length
      FROM s1
     ) , s3
    AS (-- Cut the three objective lines out of the multi-line column.
        SELECT [Prefix/Code]
       , Name
       , Credits
       , Description
       , RTRIM(LTRIM(SUBSTRING([Course-level Learning Objectives], Type1Start, Type1Length))) AS Type1_chunk
       , RTRIM(LTRIM(SUBSTRING([Course-level Learning Objectives], Type2Start, Type2Length))) AS Type2_chunk
       , RTRIM(LTRIM(SUBSTRING([Course-level Learning Objectives], Type3Start, Type3Length))) AS Type3_chunk
      FROM s2
     ) , unpivot1
    AS (-- Turn the three chunk columns into three rows.
        SELECT [Prefix/Code]
       , Name
       , Credits
       , Description
       , Type_chunk
      FROM (SELECT [Prefix/Code]
          , Name
          , Credits
          , Description
          , Type1_chunk
          , Type2_chunk
          , Type3_chunk
         FROM s3
        ) p
      UNPIVOT (Type_chunk FOR Type_descrip IN (Type1_chunk, Type2_chunk, Type3_chunk)) AS unpvt
     )
    -- Split each chunk at the '[' into the objective text and its type.
    SELECT [Prefix/Code]
     , Name
     , Credits
     , Description
     , LEFT(u.Type_chunk, dbo.udf_NthIndex(u.Type_chunk, '[', 1) - 2) AS [Learning Objectives]
     , RIGHT(u.Type_chunk, 1 + LEN(u.Type_chunk) - dbo.udf_NthIndex(u.Type_chunk, '[', 1)) AS Type
     FROM unpivot1 u;

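If you are able to use regular expressions, you can save some code. Using RegEx in SQL Server 2008 requires CLR. This book does a good job of showing you how to do that step by step. Note that this solution only works for a small number of "Type" values per course (three in the query above).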
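To give an idea of how much a regular expression shortens the "objective text plus [TYPE]" split, here is a rough sketch in Python rather than SQL CLR. The pattern and the names `LINE_PATTERN` and `split_objective` are assumptions based only on the single sample record in the question, so treat it as a starting point and not as a tested rule for the whole file.

```python
import re

# Assumed line shape: free text followed by one bracketed tag at the end,
# e.g. "Identify learn. [EXPLORE] ". Derived from the sample record only.
LINE_PATTERN = re.compile(r"^\s*(?P<objective>.*?)\s*(?P<type>\[[^\]]*\])\s*$")


def split_objective(line: str):
    """Return (objective, type) for one line, or None if it has no [TYPE] tag."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    return match.group("objective"), match.group("type")
```

Blank lines and lines without a bracketed tag return `None`, so the caller can decide whether to skip them or report them.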

0

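There is no direct way to consume this file in SSIS 2008, but it can be done in SSIS 2012, after which you can reshape the data. If you want to do it in SSIS 2008, you should use a Script Task to reformat the file first and then use a Flat File Source in the data flow (DF). In the Script Task you should use a FileReader to achieve this. For more information on splitting this file, see this link: http://sqlbisam.blogspot.com/2013/12/parsing-data-from-one-row-into-multiple.html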
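The reformatting step itself is small. Below is a sketch of the logic in Python, not the C#/VB code an SSIS Script Task would actually contain and not the code from the linked post. The function names `explode_objectives` and `explode_text` are hypothetical, and the sketch assumes that the objectives are always the last column, one per line, each ending in a bracketed type.

```python
import csv
import io


def explode_objectives(source, target):
    """Copy CSV rows from source to target, one output row per objective line.

    Assumes (not confirmed by the thread) that the last column holds the
    multi-line objectives and that each line ends with a "[TYPE]" tag.
    """
    reader = csv.reader(source)  # handles quoted fields with embedded newlines
    writer = csv.writer(target, lineterminator="\n")
    header = next(reader)
    writer.writerow([name.strip() for name in header] + ["Type"])
    for row in reader:
        if not row:  # skip blank lines between records
            continue
        *leading, objectives = row
        leading = [field.strip() for field in leading]
        for line in objectives.splitlines():
            line = line.strip()
            if not line:
                continue
            text, bracket, tag = line.rpartition("[")
            if bracket:  # found a "[": split text and type
                writer.writerow(leading + [text.strip(), "[" + tag.strip()])
            else:  # no type on this line: keep the text, leave Type empty
                writer.writerow(leading + [line, ""])


def explode_text(csv_text: str) -> str:
    """Convenience wrapper for in-memory strings."""
    target = io.StringIO()
    explode_objectives(io.StringIO(csv_text), target)
    return target.getvalue()
```

When reading from a real file, open it with `newline=""` so that the `csv` module sees the line breaks inside the quoted field unchanged. The output can then be read by an ordinary Flat File Source.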

+0

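Does SSIS 2012 offer more data-parsing tools? I'm familiar with VB and Java, so I can follow C#, but the pain with 2008 is that it doesn't provide debugging for scripts, which makes it hard to find where my errors are. – Josh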

+0

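If it is a Script Component, you cannot debug it in SSIS 2008, but you can debug a Script Task; for that you have to set the solution's configuration to 32-bit. Alternatively, you can always add MsgBox(column.ToString()) to see the results during execution. – sam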