2013-10-15

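Parsing a log file with regular expressions in Python

I'm parsing an e-mail delivery log file (to get the SMTP reply for a given message ID). It looks like this: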

Nov 12 17:26:57 zeus postfix/smtpd[23992]: E859950021DB1: client=pegasus.os[172.20.19.62] 
Nov 12 17:26:57 zeus postfix/cleanup[23995]: E859950021DB1: message-id=a92de331-9242-4d2a-8f0e-9418eb7c
Nov 12 17:26:58 zeus postfix/qmgr[22359]: E859950021DB1: from=<[email protected]>, size=114324, nrcpt=1 (queue active) 
Nov 12 17:26:58 zeus postfix/smtp[24007]: certificate verification failed for mx.elutopia.it[62.149.128.160]:25: untrusted issuer /C=US/O=RTFM, Inc./OU=Widgets Division/CN=Test CA20010517 
Nov 12 17:26:58 zeus postfix/smtp[24007]: E859950021DB1: to=<[email protected]>, relay=mx.elutopia.it[62.149.128.160]:25, delay=0.89, delays=0.09/0/0.3/0.5, dsn=2.0.0, status=sent (250 2.0.0 d3Sx1m03q0ps1bK013Sxg4 mail accepted for delivery) 
Nov 12 17:26:58 zeus postfix/qmgr[22359]: E859950021DB1: removed 
Nov 12 17:27:00 zeus postfix/smtpd[23980]: connect from pegasus.os[172.20.19.62] 
Nov 12 17:27:00 zeus postfix/smtpd[23980]: setting up TLS connection from pegasus.os[172.20.19.62] 
Nov 12 17:27:00 zeus postfix/smtpd[23980]: Anonymous TLS connection established from pegasus.os[172.20.19.62]: TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits) 
Nov 12 17:27:00 zeus postfix/smtpd[23992]: disconnect from pegasus.os[172.20.19.62] 
Nov 12 17:27:00 zeus postfix/smtpd[23980]: 2C04150101DB2: client=pegasus.os[172.20.19.62] 
Nov 12 17:27:00 zeus postfix/cleanup[23994]: 2C04150101DB2: message-id=21e2f9d3-154a-3683-85d3-a7c52d429386 
Nov 12 17:27:00 zeus postfix/qmgr[22359]: 2C04150101DB2: from=<[email protected]>, size=53237, nrcpt=1 (queue active) 
Nov 12 17:27:00 zeus postfix/smtp[24006]: ABE7C50001D62: to=<[email protected]>, relay=relay3.telnew.it[195.36.1.102]:25, delay=4.9, delays=0.1/0/4/0.76, dsn=2.0.0, status=sent (250 2.0.0 r9EFQt0J009467 Message accepted for delivery) 
Nov 12 17:27:00 zeus postfix/qmgr[22359]: ABE7C50001D62: removed 
Nov 12 17:27:00 zeus postfix/smtp[23998]: 2C04150101DB2: to=<[email protected]>, relay=liberomx2.elgravo.ch[212.52.84.93]:25, delay=0.72, delays=0.07/0/0.3/0.35, dsn=2.0.0, status=sent (250 ok: Message 2040264602 accepted) 
Nov 12 17:27:00 zeus postfix/qmgr[22359]: 2C04150101DB2: removed 

Currently I get a message ID (a UUID) from the database (for example a92de331-9242-4d2a-8f0e-9418eb7c0123) and then run my code over the log file:

import re

log_id = re.search(r'\]: (.+?): message-id=' + re.escape(message_id), text).group(1)
sent_status = re.search(r'\]: ' + re.escape(log_id) + r'.*dsn=(.....)', text).group(1)

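With the message ID I find the log_id (the ID Postfix puts at the start of each line for that message), and with the log_id I can find the SMTP reply.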
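A minimal sketch of that two-step lookup wrapped in a function, so that a message ID missing from the log yields `None` instead of an `AttributeError` from calling `.group()` on a failed search. The function name `find_dsn` is made up for illustration, and the patterns assume the log format shown in the sample above (queue ID, then `message-id=` or `to=`), which may not hold for other Postfix configurations.

```python
import re


def find_dsn(text, message_id):
    """Return the DSN code logged for message_id, or None if it is not found."""
    # Step 1: map the message ID to the log_id that prefixes its log lines.
    m = re.search(r'\]: (\w+): message-id=' + re.escape(message_id), text)
    if m is None:
        return None
    log_id = m.group(1)
    # Step 2: find the delivery line for that log_id and extract the DSN code.
    # "." does not match newlines by default, so the match stays on one line.
    m = re.search(r'\]: ' + re.escape(log_id) + r': to=.*?dsn=(\d\.\d+\.\d+)', text)
    return m.group(1) if m else None
```

Calling `find_dsn(text, message_id)` with the whole log as `text` returns a string such as `'2.0.0'` when both lines are present.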

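This works fine, but a better approach would be for the program to go through the log file itself, pick up the message IDs and reply codes, and update the database. I'm not sure how to do that, though. The script has to run every 2 minutes and check the updated log file. So how can I make sure it remembers its position and doesn't pick up the same message ID twice? Thanks in advance.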


You could store the last message ID you read somewhere in the database. – Ashalynd

Answer


Use a dictionary to store the message IDs, and a separate file to store the byte offset at which you left off in the log file last time.

msgIDs = {}
# Get where you left off in the logfile during the last read:
try:
    with open('logfile_placemarker.txt', 'r') as f:
        lastRead = int(f.read())
except (IOError, ValueError):
    # Marker file is missing, unreadable, or does not contain a number
    print("Can't find/read place marker file! Starting at 0")
    lastRead = 0

with open('logfile.log', 'r') as f:
    f.seek(lastRead)
    for line in f:
        # ...
        # Pick out msgIDs and response codes
        # (this elided step must set msgID and responseCode)
        # ...
        if msgID in msgIDs:
            print("uh oh, found the same msg id twice!!")
        msgIDs[msgID] = responseCode
    lastRead = f.tell()

# Do whatever you need to do with the msgIDs you found:
updateDB(msgIDs)
# Store lastRead (where you left off in the logfile) in a file so it persists for the next run
with open('logfile_placemarker.txt', 'w') as f:
    f.write(str(lastRead))
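
One way the elided "pick out msgIDs and response codes" step might be filled in is sketched below. This is only an illustration under several assumptions: the function name `read_new_lines`, the two regular expressions, and the handling of a log that has shrunk (for example after rotation) are not part of the answer above, and the patterns are based solely on the sample lines in the question. The file is opened in binary mode so that the stored offset is a plain byte count.

```python
import os
import re

# Patterns derived from the sample log lines in the question (an assumption).
MSGID_RE = re.compile(r'\]: (\w+): message-id=(\S+)')
DSN_RE = re.compile(r'\]: (\w+): to=.*?dsn=(\d\.\d+\.\d+)')


def read_new_lines(log_path, marker_path):
    """Parse lines appended since the last run; return {message_id: dsn}."""
    try:
        with open(marker_path) as f:
            offset = int(f.read())
    except (IOError, ValueError):
        offset = 0
    # Assumed rotation handling: a log shorter than the saved offset is treated as new.
    if os.path.getsize(log_path) < offset:
        offset = 0

    queue_to_msgid = {}  # log_id -> message ID, for lines seen in this run
    results = {}
    with open(log_path, 'rb') as f:
        f.seek(offset)
        for raw in f:
            if not raw.endswith(b'\n'):
                break  # incomplete last line; leave it for the next run
            offset += len(raw)
            line = raw.decode('utf-8', 'replace')
            m = MSGID_RE.search(line)
            if m:
                queue_to_msgid[m.group(1)] = m.group(2)
                continue
            m = DSN_RE.search(line)
            if m and m.group(1) in queue_to_msgid:
                results[queue_to_msgid[m.group(1)]] = m.group(2)

    # Remember how far we got so the next run starts after these lines.
    with open(marker_path, 'w') as f:
        f.write(str(offset))
    return results
```

A call such as `updateDB(read_new_lines('logfile.log', 'logfile_placemarker.txt'))` would then replace the loop. Note one limitation of this sketch: if a message's `message-id=` line and its delivery line fall into different runs, the pair is missed, so `queue_to_msgid` would also need to be persisted between runs to cover that case.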