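XML log file regex
A legacy system that I can't change spits out 5 GB a day of mostly useless XML logs and blows through my ingestion license. Two classes of verbose error occur more than 1000 times per minute, but every few minutes there is a genuinely interesting entry. I want to use sed to drastically shorten the repetitive entries and keep the interesting ones unchanged.
So what I need is:
1. A regex that matches each of the 2 classes of annoying log entry (e.g. "...'decimal'..." and "...'DBNull'...") but not the occasional interesting one.
One regex per annoying error class is fine; I can do 2 sed passes.
2. A capture group with the timestamp, so I can replace the long XML entry with a concise version - but with the correct timestamp, so as not to lose fidelity.
I've got as far as this, which matches and captures the created date: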
(?:<Log).*?(createdDate="\d{2}\/\d{2}\/\d{4}.\d{2}:\d{2}:\d{2}").*?(?:decimal).*?(<\/Log>)
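This is close, but because of a kind of reverse greediness, the match runs from 'decimal' back to an opening Log tag several entries earlier. Playing around with negative lookbehind has only given me a serious headache.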
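One way to picture the problem: a lazy `.*?` only minimises the match from a fixed starting point, so nothing stops it from running past one entry's `</Log>` into the next. A possible fix is to forbid the body from containing `</Log>` at all, and to keep the opening tag inside `[^>]*`. The following is a minimal Python sketch of that idea, not a sed command; the names `NOISY_ENTRY` and `shorten` and the replacement text are hypothetical, and it assumes the whole text fits in memory.

```python
import re

# Hypothetical pattern, adapted from the one in the question.
NOISY_ENTRY = re.compile(
    # Opening tag only: [^>]* cannot run past the end of the tag.
    r'(?P<open><Log\b[^>]*\bcreatedDate="\d{2}/\d{2}/\d{4}.\d{2}:\d{2}:\d{2}"[^>]*>)'
    # Body text, one character at a time, never stepping over </Log>.
    r"(?:(?!</Log>).)*?"
    # The marker that identifies one of the two noisy error classes.
    r"'(?P<kind>decimal|DBNull)'"
    # Rest of the same entry, again without crossing </Log>.
    r"(?:(?!</Log>).)*"
    r"</Log>",
    re.DOTALL,
)


def shorten(text: str) -> str:
    """Replace each noisy entry with a one-line stub that keeps its opening tag."""
    def repl(m: re.Match) -> str:
        return f"{m.group('open')}<![CDATA[ '{m.group('kind')}' error (shortened) ]]></Log>"
    return NOISY_ENTRY.sub(repl, text)


sample = """<Log type="ERROR" createdDate="11/09/2015 08:13:12" >
<![CDATA[ [129] cruft Conversion from type 'DBNull' to type 'Long' is not valid.
]]></Log>
<Log type="ERROR" createdDate="11/09/2015 08:13:10" >
<![CDATA[ [231] An actual interesting log entry with a real error message ]]></Log>
<Log type="ERROR" createdDate="11/09/2015 08:13:09" >
<![CDATA[ [108] cruft The value '' cannot be parsed as the type 'decimal'.
]]></Log>
"""

print(shorten(sample))
```

The per-character lookahead is slow on large inputs, so for 5 GB a day this is probably better treated as a way to check the pattern than as the production filter.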
Sample data
<Log type="ERROR" createdDate="11/09/2015 08:13:14" >
<![CDATA[ [108] -- much cruft removed-- SerializationException: There was an error deserializing the object of type Common.DataCtract.QResult. The value '' cannot be parsed as the type 'decimal'. ---> System.Xml.XmlException: The value '' cannot be parsed as the type 'decimal'. ---> System.FormatException: Input string was not in a correct format.
]]></Log>
<Log type="ERROR" createdDate="11/09/2015 08:13:13" >
<![CDATA[ [108] -- much cruft removed-- SerializationException: There was an error deserializing the object of type Common.DataCtract.QResult. The value '' cannot be parsed as the type 'decimal'. ---> System.Xml.XmlException: The value '' cannot be parsed as the type 'decimal'. ---> System.FormatException: Input string was not in a correct format.
]]></Log>
<Log type="ERROR" createdDate="11/09/2015 08:13:12" >
<![CDATA[ [129] Services.DService.D.FailedToAddRQ(Exceptionex, RQEntityrQ, RHeaderEntityrHeader, StringPRef,): FailedToAddRQ()...with parameters [pRef:=123,0,1], [rQ.AffinityCode:=],[Q.thing=thing][rQ.AffinityRQDT:=123],[rHeader.RHeaderIDPK:=123],[rQ.UWriteIDFK:=]
Data.DataAccessLayerException: Conversion from type 'DBNull' to type 'Long' is not valid.
Parameters:
[RETURN_VALUE][ReturnValue] Value: [0]
---> System.InvalidCastException: Conversion from type 'DBNull' to type 'Long' is not valid.
]]></Log>
<Log type="ERROR" createdDate="11/09/2015 08:13:11" >
<![CDATA[ [129] Services.DService.D.FailedToAddRQ(Exceptionex, RQEntityrQ, RHeaderEntityrHeader, StringPRef,): FailedToAddRQ()...with parameters [pRef:=123,0,1], [rQ.AffinityCode:=],[Q.thing=thing][rQ.AffinityRQDT:=123],[rHeader.RHeaderIDPK:=123],[rQ.UWriteIDFK:=]
Data.DataAccessLayerException: Conversion from type 'DBNull' to type 'Long' is not valid.
]]></Log>
<Log type="ERROR" createdDate="11/09/2015 08:13:10" >
<![CDATA[ [231] An actual interesting log entry with a real error message ]]></Log>
<Log type="ERROR" createdDate="11/09/2015 08:13:09" >
<![CDATA[ [108] -- much cruft removed-- SerializationException: There was an error deserializing the object of type Common.DataCtract.QResult. The value '' cannot be parsed as the type 'decimal'. ---> System.Xml.XmlException: The value '' cannot be parsed as the type 'decimal'. ---> System.FormatException: Input string was not in a correct format.
]]></Log>
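Perfect, thanks Casimir - you were right about the log entries starting at the beginning of a line. A sed-based solution rather than a pure regex wasn't quite what I expected - but it's very insightful, and definitely the way to go.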
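For readers who want to see the line-oriented idea without sed, here is a rough Python sketch of a streaming filter. It is not the answer referred to above, just an illustration under the same assumption that every entry starts with `<Log` at the beginning of a line and ends on a line containing `</Log>`. The names `OPEN_TAG`, `NOISE` and `filter_entries` and the stub text are hypothetical.

```python
import re
from typing import Iterable, Iterator

OPEN_TAG = re.compile(r"<Log\b[^>]*>")   # matched only at the start of a line
NOISE = ("decimal", "DBNull")            # the two noisy error classes


def filter_entries(lines: Iterable[str]) -> Iterator[str]:
    """Yield lines, collapsing noisy <Log> entries into a one-line stub."""
    entry: list[str] = []
    for line in lines:
        if not entry and not OPEN_TAG.match(line):
            yield line                   # outside any entry: pass through
            continue
        entry.append(line)
        if "</Log>" in line:             # entry complete: decide what to emit
            body = "".join(entry)
            kind = next((k for k in NOISE if f"'{k}'" in body), None)
            if kind is None:
                yield from entry         # interesting entry: keep verbatim
            else:
                tag = OPEN_TAG.match(entry[0]).group(0)   # keeps createdDate
                yield f"{tag}<![CDATA[ '{kind}' error (shortened) ]]></Log>\n"
            entry = []
    yield from entry                     # unterminated trailing entry, kept as-is


sample_lines = """<Log type="ERROR" createdDate="11/09/2015 08:13:10" >
<![CDATA[ [231] An actual interesting log entry with a real error message ]]></Log>
<Log type="ERROR" createdDate="11/09/2015 08:13:09" >
<![CDATA[ [108] cruft The value '' cannot be parsed as the type 'decimal'.
]]></Log>
""".splitlines(keepends=True)

filtered = "".join(filter_entries(sample_lines))
print(filtered)
```

Because it holds only one entry in memory at a time, the same generator could be fed `sys.stdin` and written to `sys.stdout` line by line for large files.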