2013-05-02

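Regex to extract a message

I have a script that reads the lines of a file, and some of those lines contain error messages. So I wrote a loop (only a single line is shown here) to find those lines and extract the message: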

import re 

data = "15:31:17 TPP E Line 'MESSAGE': There is a technical problem in the server." 

if (re.findall(".*E Line.*",data)): 
    err = re.match(r'\'MESSAGE\':\s(*$)',data) 
    print err 

I get an error when I run this script. I would like it to return:

There is a technical problem in the server 

Answers

4

You don't need a regular expression for this if all the lines follow the same format:

>>> data = "15:31:17 TPP E Line 'MESSAGE': There is a technical problem in the server." 
>>> data.rsplit(':', 1)[1] 
' There is a technical problem in the server.' 
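
Note that `rsplit(':', 1)` keeps the leading space and splits on the last colon, so a message that itself contains a colon would be cut short. Here is a minimal sketch of a variant that splits on the `'MESSAGE':` marker instead. The helper name `extract_message` is made up for illustration, and the sketch assumes every error line carries that exact marker:

```python
def extract_message(line, marker="'MESSAGE':"):
    """Return the text after the marker, or None if the marker is absent."""
    # partition splits on the first occurrence of the marker only,
    # so colons inside the message itself are left untouched.
    _, sep, rest = line.partition(marker)
    if not sep:
        return None
    # Drop the space after the marker and any trailing newline.
    return rest.strip()


data = "15:31:17 TPP E Line 'MESSAGE': There is a technical problem in the server."
print(extract_message(data))
```

Returning `None` when the marker is missing lets the calling loop skip lines that are not error messages.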

But if you have to use one...

>>> data = "15:31:17 TPP E Line 'MESSAGE': There is a technical problem in the server." 
>>> ms = re.search(r"'MESSAGE': (.*)$", data) 
>>> ms.group(1) 
'There is a technical problem in the server.' 
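
For reference, the pattern in the question fails for two separate reasons. First, `(*$)` has a `*` with nothing before it to repeat, so `re` rejects the pattern. Second, `re.match` only matches at the start of the string, whereas `'MESSAGE':` appears in the middle. A small sketch that demonstrates both points (the variable names are only for illustration):

```python
import re

data = "15:31:17 TPP E Line 'MESSAGE': There is a technical problem in the server."

# 1) A quantifier with nothing to repeat is rejected when the pattern is compiled.
try:
    re.compile(r"'MESSAGE':\s(*$)")
    compiled_ok = True
except re.error:
    compiled_ok = False

# 2) re.match anchors at the beginning of the string;
#    re.search scans the whole string for the first match.
pattern = r"'MESSAGE':\s(.*)$"
anchored = re.match(pattern, data)
scanned = re.search(pattern, data)

print(compiled_ok)        # the broken pattern did not compile
print(anchored)           # no match at the start of the line
print(scanned.group(1))   # the message text
```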

If you want, you can extract the other information as well:

>>> ms = re.match(r"(\d\d:\d\d:\d\d)\s+(\S+)\s+(\S+)\s+Line\s+'MESSAGE':\s+(.*)", data) 
>>> ms.groups() 
('15:31:17', 'TPP', 'E', 'There is a technical problem in the server.') 
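
If the positional tuple is hard to read, the same pattern can be written with named groups. This is only a sketch: the group names `time`, `source` and `level` are guesses at what the fields mean and do not come from the log format itself.

```python
import re

data = "15:31:17 TPP E Line 'MESSAGE': There is a technical problem in the server."

# Same pattern as above, with (?P<name>...) groups; the names are assumptions.
LOG_LINE = re.compile(
    r"(?P<time>\d\d:\d\d:\d\d)\s+"
    r"(?P<source>\S+)\s+"
    r"(?P<level>\S+)\s+"
    r"Line\s+'MESSAGE':\s+"
    r"(?P<message>.*)"
)

ms = LOG_LINE.match(data)
# groupdict() maps each group name to the text it captured.
fields = ms.groupdict() if ms else {}
print(fields)
```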
1

Try this:

import re 

data = "15:31:17 TPP E Line 'MESSAGE': There is a technical problem in the server." 

r = re.compile("^.*E Line.*'MESSAGE':[ ]*([^ ].*)$") 
m = r.match(data) 
if m: 
    err = m.group(1) 
    print(err) 

Of course, you should compile the regular expression outside the loop.
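
A rough sketch of what that could look like, with the pattern compiled once and reused for every line. The function name `extract_errors` and the sample lines are invented for illustration; in the real script the lines would come from the file being read.

```python
import re

# Compiled once, outside the loop.
ERROR_LINE = re.compile(r"^.*E Line.*'MESSAGE':[ ]*([^ ].*)$")


def extract_errors(lines):
    """Collect the message text from every line that matches the error pattern."""
    errors = []
    for line in lines:
        # Strip the trailing newline that file iteration leaves on each line.
        m = ERROR_LINE.match(line.rstrip("\n"))
        if m:
            errors.append(m.group(1))
    return errors


# Hypothetical input; a file object opened with open(...) can be passed the same way.
sample_lines = [
    "15:31:16 TPP I Line 'STATUS': Connection established.\n",
    "15:31:17 TPP E Line 'MESSAGE': There is a technical problem in the server.\n",
]
print(extract_errors(sample_lines))
```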