2015-05-14

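Manipulating a log file: parsing and embedded XML

I have a log file that contains normal STDOUT output with XML embedded between the log lines, as follows: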

2015-05-06 04:07:37.386 [INFO]Process:102 - Application submitted Successfully ==== 1 
<APPLICATION><FirstName>Test</FirstName><StudentSSN>123456789</StudentSSN><Address>123 Test Street</Address><ParentSSN>123456780</ParentSSN><APPLICATIONID>2</APPLICATIONID></APPLICATION> 
2015-05-06 04:07:39.386 [INFO] Process:103 - Application completed Successfully ==== 1 
2015-05-06 04:07:37.386 [INFO]Process:104 - Application submitted Successfully ==== 1 
<APPLICATION><FirstName>Test2</FirstName><StudentSSN>323456789</StudentSSN><Address>234 Test Street</Address><ParentSSN>123456780</ParentSSN><APPLICATIONID>2</APPLICATIONID></APPLICATION> 
2015-05-06 04:07:39.386 [INFO] Process:105 - Application completed Successfully ==== 1 

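My goal is to parse this file and replace every occurrence of personal data with ***. The desired output after running a script on the log above would be: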

2015-05-06 04:07:37.386 [INFO]Process:102 - Application submitted Successfully ==== 1 
<APPLICATION><FirstName>***</FirstName><StudentSSN>***</StudentSSN><Address>*******</Address><ParentSSN>*********</ParentSSN> <APPLICATIONID>2</APPLICATIONID></APPLICATION> 
2015-05-06 04:07:39.386 [INFO] Process:103 - Application completed Successfully ==== 1 
2015-05-06 04:07:37.386 [INFO]Process:104 - Application submitted Successfully ==== 1 
<APPLICATION><FirstName>***</FirstName><StudentSSN>*********</StudentSSN><Address>*****</Address><ParentSSN>*********</ParentSSN> <APPLICATIONID>2</APPLICATIONID></APPLICATION> 
2015-05-06 04:07:39.386 [INFO] Process:105 - Application completed Successfully ==== 1 

Thank you in advance.


Are the problems in the XML typos on your part, or do they come from the application that generates the log?


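@Guido Any problems in the tags (such as spaces) are typos. However, the XML is generated in the log file the way I have shown it: the line above the opening tag is the "Application" line, and the XML closes after that. Does that make sense?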


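Honestly, no. You have a 'FirstName' element that is not closed properly, and the same goes for the 'applicationid' element. Is it generated that way, i.e. it is invalid XML, or is that only in the snippet you posted (and the XML in the log file can be assumed to be valid)?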

Answer


Create a file foo.sed with this content:

s|<FirstName>[^<]*</FirstName>|<FirstName>***</FirstName>| 
s|<StudentSSN>[^<]*</StudentSSN>|<StudentSSN>***</StudentSSN>| 
s|<Address>[^<]*</Address>|<Address>***</Address>| 
s|<ParentSSN>[^<]*</ParentSSN>|<ParentSSN>***</ParentSSN>| 

Then run it with GNU sed:

sed -f foo.sed log_file > new_file

Or edit the file "in place":

sed -i -f foo.sed log_file
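
As a possible variation (not part of the answer's script), the four rules can be folded into a single rule with a back-reference to the tag name. The sketch below assumes GNU sed, as the answer does; the file names mask.sed, sample.log and masked.log are made up for illustration, and the sample data is copied from the question. It also adds the g flag, so a tag that appears more than once on a line would be masked each time, which the four separate rules above do not do.

```shell
#!/bin/sh
# Sketch only: mask.sed, sample.log and masked.log are hypothetical names.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# One rule for all four tags. With -E (extended regexes), \1 refers back to
# the tag name captured by the first group. '#' is used as the delimiter
# because '|' is needed for alternation and '/' appears in the closing tag.
cat > mask.sed <<'EOF'
s#<(FirstName|StudentSSN|Address|ParentSSN)>[^<]*</\1>#<\1>***</\1>#g
EOF

# Sample input taken from the question.
cat > sample.log <<'EOF'
2015-05-06 04:07:37.386 [INFO]Process:102 - Application submitted Successfully ==== 1
<APPLICATION><FirstName>Test</FirstName><StudentSSN>123456789</StudentSSN><Address>123 Test Street</Address><ParentSSN>123456780</ParentSSN><APPLICATIONID>2</APPLICATIONID></APPLICATION>
2015-05-06 04:07:39.386 [INFO] Process:103 - Application completed Successfully ==== 1
EOF

# Write the masked copy next to the original, then show it.
sed -E -f mask.sed sample.log > masked.log
cat masked.log
```

Log lines without XML pass through unchanged, and tags that are not listed in the rule (such as APPLICATIONID) keep their values. Like the original script, this replaces each value with a fixed ***, regardless of the length of the masked text.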

That did it. Genius! Thank you very much. I didn't realize it would be this easy!


@Guido Thank you very much for your time!


I have added the content back and posted my new question at http://stackoverflow.com/questions/30249841/using-wildcards-with-sed