2017-06-12 39 views
2

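Parsing an unstructured text file in SSIS, reading each line to get the required data

I am working with SSIS. I have a complex, unstructured text file, and I need to create an SSIS package that parses it and loads the required columns into a database. What is the best way to parse the text file, and how can I write a script that reads each line of it? I am also unsure whether I can read each line of the text file without writing a script at all.

The columns I need from the text file are DEVICEID, DATAVALUE, and dataUnits:

Here is the text file: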

12/02/2015 09:47:44:745 SecureHARTPort version: 1.1.12.0. 

    12/02/2015 09:47:44:745 Connecting and initialing Session to 
    67.40.65.181 Port:5094 Tcp 
    12/02/2015 09:47:44:745 Tx: Message Header: Ver: 1, MsgType: 0, MsgId: 0 
    Status: 0x00 
    TranId: 1, Data ByteCount: 5 
    Data: 01 00 09 27 C0 

    12/02/2015 09:47:44:761 Rx: Message Header: Ver: 1, MsgType: 1, MsgId: 0 
    Status: 0x00 
    TranId: 1, Data ByteCount: 5 
    Data: 01 00 09 27 C0 
    12/02/2015 09:47:44:855 Tx: Message Header: Ver: 1, MsgType: 0, MsgId: 3 
    Status: 0x00 
TranId: 2, Data ByteCount: 5 
Data: 02 80 00 00 82 

12/02/2015 09:47:44:855 Rx: Message Header: Ver: 1, MsgType: 1, MsgId: 3 
Status: 0x00 
TranId: 2, Data ByteCount: 29 
Data: 06 80 00 18 00 50 FE 26 4E 05 07 05 02 0E 0C 0B 6A 64 05 04 00 01 50 
00 26 00 26 84 8E 

Rx Cmd=0, Rsp code=0x00, Device Status=0x50 
Expansion Code=254 
Expanded Device Type=9806 
# Request Preambles=5 
Universal Comand Revision Level=7 
Transmitter HART Revision Level=5 
Software Revision=2 
Hardware Revision Level/Physical Signaling Code=14 
Flags=0C 
Device ID=748132 
Minimum # Response Preambles=5 
Max # of device variables=4 
Configuration Change Counter=1 
Extended Field Device Status=50 
Manufacturer's ID=38 
Private Label Distributor=38 
Device Profile=132 

12/02/2015 09:47:44:855 Tx: Message Header: Ver: 1, MsgType: 0, MsgId: 3 
Status: 0x00 
TranId: 3, Data ByteCount: 9 
    Data: 82 A6 4E 0B 6A 64 14 00 7B 

    12/02/2015 09:47:44:870 Rx: Message Header: Ver: 1, MsgType: 1, MsgId: 3 
    Status: 0x00 
    TranId: 3, Data ByteCount: 43 
    Data: 86 A6 4E 0B 6A 64 14 22 00 50 77 69 68 61 72 74 67 77 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0C 

Rx Cmd=20, Rsp code=0x00, Device Status=0x50 
Long Tag=wihartgw 

    12/02/2015 09:47:44:870 Tx: Message Header: Ver: 1, MsgType: 0, MsgId: 3 
Status: 0x00 
TranId: 4, Data ByteCount: 9 
Data: 82 A6 4E 0B 6A 64 4A 00 25 

12/02/2015 09:47:44:886 Rx: Message Header: Ver: 1, MsgType: 1, MsgId: 3 
Status: 0x00 
TranId: 4, Data ByteCount: 19 
    Data: 86 A6 4E 0B 6A 64 4A 0A 00 50 01 01 65 00 05 02 01 03 1B 

    Rx Cmd=74, Rsp code=0x00, Device Status=0x50 
Max Num IO Cards=1 
Max Num Channels per IO Card=1 
Max Num Sub-Devices per Channel=101 
    Num Devices Detected=5 
    Max Num DR Supported=2 
    Master Mode for Comm=1 
    Retry Count for Sub-Device=3 

    Rx Cmd=9, Rsp code=0x00, Device Status=0x50 
    Extended Device Status=0 
    Slot0 Var Code=246 
    Slot0 Var Classification=0 
    Slot0 Var Units=251 
    Slot0 Var Value=4 
    Slot0 Var Status=C0 
    Slot1 Var Code=116 
    Slot1 Var Classification=209 
    Slot1 Var Units=70 
Slot1 Var Value=0 

Answers

0

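You will definitely need a Script Task to handle this.

The Script Task can use the file system object to get a reference to the file, read it line by line, and look for strings such as:

Device ID=xxx 
Value=xxx 
Units=xxx 

Then take whatever the xxx value is in each case and insert it into the database.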
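
The line-by-line matching described above can be sketched as follows. This is a minimal Python illustration of the logic only, not SSIS code: a real Script Task would need the same idea ported to C# or VB.NET. The function name `parse_log`, the regular expressions, and the assumption that every Slot line belongs to the most recently seen Device ID are all mine, not something the log format guarantees.

```python
import re

# Patterns for the three kinds of line mentioned above (assumed formats).
DEVICE_RE = re.compile(r"^Device ID=(\d+)$")
SLOT_RE = re.compile(r"^Slot(\d+) Var (Units|Value)=(\S+)$")


def parse_log(lines):
    """Return one dict per (device, slot) with keys device_id, slot, value, units."""
    device_id = None
    rows = {}  # (device_id, slot) -> partially filled row, in first-seen order
    for raw in lines:
        line = raw.strip()  # the sample log is indented inconsistently
        m = DEVICE_RE.match(line)
        if m:
            device_id = m.group(1)
            continue
        m = SLOT_RE.match(line)
        if m:
            slot, field, text = int(m.group(1)), m.group(2), m.group(3)
            row = rows.setdefault(
                (device_id, slot),
                {"device_id": device_id, "slot": slot, "value": None, "units": None},
            )
            row[field.lower()] = text  # "Units" -> "units", "Value" -> "value"
    return list(rows.values())


# A few lines copied from the log in the question.
sample = """\
Device ID=748132
    Slot0 Var Units=251
    Slot0 Var Value=4
    Slot1 Var Units=70
Slot1 Var Value=0
"""
rows = parse_log(sample.splitlines())
for row in rows:
    print(row)
```

Each returned dict corresponds to one candidate database row; the insert itself is left out because the target table is not described in the question.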

2

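I am not sure whether this will help you, but you could first read the text line by line with a T-SQL script like the one below and then apply suitable filters: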

DECLARE @YourText NVARCHAR(MAX)= 
N' 12/02/2015 09:47:44:745 SecureHARTPort version: 1.1.12.0. 

    12/02/2015 09:47:44:745 Connecting and initialing Session to 
    67.40.65.181 Port:5094 Tcp 
    12/02/2015 09:47:44:745 Tx: Message Header: Ver: 1, MsgType: 0, MsgId: 0 
    Status: 0x00 
    TranId: 1, Data ByteCount: 5 
    Data: 01 00 09 27 C0 

    12/02/2015 09:47:44:761 Rx: Message Header: Ver: 1, MsgType: 1, MsgId: 0 
    Status: 0x00 
    TranId: 1, Data ByteCount: 5 
    Data: 01 00 09 27 C0 
    12/02/2015 09:47:44:855 Tx: Message Header: Ver: 1, MsgType: 0, MsgId: 3 
    Status: 0x00 
TranId: 2, Data ByteCount: 5 
Data: 02 80 00 00 82 

12/02/2015 09:47:44:855 Rx: Message Header: Ver: 1, MsgType: 1, MsgId: 3 
Status: 0x00 
TranId: 2, Data ByteCount: 29 
Data: 06 80 00 18 00 50 FE 26 4E 05 07 05 02 0E 0C 0B 6A 64 05 04 00 01 50 
00 26 00 26 84 8E 

Rx Cmd=0, Rsp code=0x00, Device Status=0x50 
Expansion Code=254 
Expanded Device Type=9806 
# Request Preambles=5 
Universal Comand Revision Level=7 
Transmitter HART Revision Level=5 
Software Revision=2 
Hardware Revision Level/Physical Signaling Code=14 
Flags=0C 
Device ID=748132 
Minimum # Response Preambles=5 
Max # of device variables=4 
Configuration Change Counter=1 
Extended Field Device Status=50 
Manufacturer''s ID=38 
Private Label Distributor=38 
Device Profile=132 

12/02/2015 09:47:44:855 Tx: Message Header: Ver: 1, MsgType: 0, MsgId: 3 
Status: 0x00 
TranId: 3, Data ByteCount: 9 
    Data: 82 A6 4E 0B 6A 64 14 00 7B 

    12/02/2015 09:47:44:870 Rx: Message Header: Ver: 1, MsgType: 1, MsgId: 3 
    Status: 0x00 
    TranId: 3, Data ByteCount: 43 
    Data: 86 A6 4E 0B 6A 64 14 22 00 50 77 69 68 61 72 74 67 77 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0C 

Rx Cmd=20, Rsp code=0x00, Device Status=0x50 
Long Tag=wihartgw 

    12/02/2015 09:47:44:870 Tx: Message Header: Ver: 1, MsgType: 0, MsgId: 3 
Status: 0x00 
TranId: 4, Data ByteCount: 9 
Data: 82 A6 4E 0B 6A 64 4A 00 25 

12/02/2015 09:47:44:886 Rx: Message Header: Ver: 1, MsgType: 1, MsgId: 3 
Status: 0x00 
TranId: 4, Data ByteCount: 19 
    Data: 86 A6 4E 0B 6A 64 4A 0A 00 50 01 01 65 00 05 02 01 03 1B 

    Rx Cmd=74, Rsp code=0x00, Device Status=0x50 
Max Num IO Cards=1 
Max Num Channels per IO Card=1 
Max Num Sub-Devices per Channel=101 
    Num Devices Detected=5 
    Max Num DR Supported=2 
    Master Mode for Comm=1 
    Retry Count for Sub-Device=3 

    Rx Cmd=9, Rsp code=0x00, Device Status=0x50 
    Extended Device Status=0 
    Slot0 Var Code=246 
    Slot0 Var Classification=0 
    Slot0 Var Units=251 
    Slot0 Var Value=4 
    Slot0 Var Status=C0 
    Slot1 Var Code=116 
    Slot1 Var Classification=209 
    Slot1 Var Units=70 
Slot1 Var Value=0'; 

--The query splits the text into lines at any combination of CHAR(13) and/or CHAR(10):

WITH LineByLine AS 
(
    SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS LineNr 
      ,LTRIM(RTRIM(x.value(N'(text())[1]',N'nvarchar(max)'))) AS Line 
    FROM 
    (
    SELECT CAST(N'<x>' + REPLACE((SELECT REPLACE(REPLACE(REPLACE(@YourText,NCHAR(10),NCHAR(13)),NCHAR(13)+NCHAR(13),NCHAR(13)),NCHAR(13),N'\nl') AS [*] FOR XML PATH('')),N'\nl',N'</x><x>') + N'</x>'AS XML) AS Casted 
    ) AS t 
    CROSS APPLY Casted.nodes(N'/x[text()]') AS A(x) 
) 
SELECT LineNr,Line 
FROM LineByLine 
WHERE CHARINDEX('Device ID=',Line)>0 
    OR CHARINDEX('Data:',Line)>0 
    OR CHARINDEX('unit',Line)>0; 

The result will be:

Nr Line 
7 Data: 01 00 09 27 C0 
11 Data: 01 00 09 27 C0 
15 Data: 02 80 00 00 82 
19 Data: 06 80 00 18 00 50 FE 26 4E 05 07 05 02 0E 0C 0B 6A 64 05 04 00 01 50 
30 Device ID=748132 
41 Data: 82 A6 4E 0B 6A 64 14 00 7B 
45 Data: 86 A6 4E 0B 6A 64 14 22 00 50 77 69 68 61 72 74 67 77 00 00 00 00 00 
52 Data: 82 A6 4E 0B 6A 64 4A 00 25 
56 Data: 86 A6 4E 0B 6A 64 4A 0A 00 50 01 01 65 00 05 02 01 03 1B 
69 Slot0 Var Units=251 
74 Slot1 Var Units=70 

You did not state your expected output, and the column names you mention do not appear in your text, so this is guesswork... Hope it helps...
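
For comparison, here is a rough Python sketch of the same split-and-filter idea, in case the file is pre-processed outside the database. It is not a translation of the query: the helper names `numbered_lines` and `filter_lines` are hypothetical, and the lower-casing is my assumption to imitate a case-insensitive collation (which is what lets 'unit' match "Units" in the result above).

```python
import re


def numbered_lines(text):
    """Split on any run of CR/LF characters, trim, drop blanks, number from 1."""
    parts = (p.strip() for p in re.split(r"[\r\n]+", text))
    return list(enumerate((p for p in parts if p), start=1))


def filter_lines(text, needles=("device id=", "data:", "unit")):
    """Keep the numbered lines that contain any needle, ignoring case."""
    return [
        (nr, line)
        for nr, line in numbered_lines(text)
        if any(needle in line.lower() for needle in needles)
    ]


# Mixed line endings on purpose: CRLF, LF and a bare CR.
sample = (
    "Status: 0x00\r\n\r\n"
    "  Data: 01 00 09 27 C0\n\n"
    "Device ID=748132\r"
    "Slot0 Var Units=251\n"
    "Slot0 Var Value=4"
)
hits = filter_lines(sample)
for nr, line in hits:
    print(nr, line)
```

As in the query, the line numbers count only non-empty lines, so they will not match the numbering in a text editor.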

+0

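Thanks, but I need help with how to write the script in a Script Component to read each line, because I am not familiar with hard coding. – a5656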

+0

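@akhil A synchronous Script Component has a method, named Input0_ProcessInputRow by default, that is called once for every row. [Read more in this MSDN article](https://docs.microsoft.com/en-us/sql/integration-services/extending-packages-scripting-data-flow-script-component-types/creating-a-synchronous-transformation-with-the-script-component), but you will have to write this script in C# or VB.NET – Hadi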
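
Because the values of interest are spread over several lines, a method that is called once per row has to remember state between calls. The Python sketch below only imitates that calling pattern; the class `LineRowHandler` and its `process_row` method are hypothetical stand-ins, not part of any SSIS API, and the real component would be written in C# or VB.NET. It also assumes each file line arrives as one row.

```python
class LineRowHandler:
    """Hypothetical stand-in for a per-row callback; not an SSIS API."""

    def __init__(self):
        self.device_id = None  # remembered across calls
        self.output = []       # rows emitted so far

    def process_row(self, line):
        """Handle one line; emit (device_id, slot, value) on 'SlotN Var Value'."""
        key, sep, value = line.strip().partition("=")
        if not sep:
            return  # not a key=value line, nothing to do
        if key == "Device ID":
            self.device_id = value
        elif key.startswith("Slot") and key.endswith("Var Value"):
            self.output.append((self.device_id, key.split()[0], value))


handler = LineRowHandler()
for line in [
    "Device ID=748132",
    "  Slot0 Var Units=251",
    "  Slot0 Var Value=4",
    "Status: 0x00",
]:
    handler.process_row(line)
print(handler.output)
```

The design point is only that the device ID is stored on the object when its line is seen and reused when a later value line arrives.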
