2011-12-11

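Python log parsing for IPs

I am new to Python and have been working through some tutorials on parsing logs with regular expressions. With the code below I can parse a log and create a file of the remote IPs that connected to the server. What I am missing is the part that eliminates duplicate IPs from the out.txt file that gets created. Thanks.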

import re 
import sys 

infile = open("/var/log/user.log","r") 
outfile = open("/var/log/intruders.txt","w") 

pattern = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}" 
regexp = re.compile(pattern, re.VERBOSE) 

for line in infile: 
    result = regexp.search(line) 
    if result: 
        outfile.write("%s\n" % (result.group()))

infile.close() 
outfile.close() 

Answer

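You can keep the results seen so far in a set(), and then write out only the results that have not been seen yet. This logic is easy to add to your existing code: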

import re 
import sys 

seen = set() 

infile = open("/var/log/user.log","r") 
outfile = open("/var/log/intruders.txt","w") 

pattern = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}" 
regexp = re.compile(pattern, re.VERBOSE) 

for line in infile: 
    mo = regexp.search(line) 
    if mo is not None: 
        ip_addr = mo.group()
        # Write each address only the first time it appears
        if ip_addr not in seen:
            seen.add(ip_addr)
            outfile.write("%s\n" % ip_addr)

infile.close() 
outfile.close()
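
The same idea can also be packaged so that the deduplication is separate from the file handling, which makes it easier to try on a few sample lines. This is only a sketch under some assumptions: the names `IP_PATTERN`, `unique_ips` and `write_unique_ips` are made up for illustration, and, like the code above, it only looks at the first IP-like match on each line and does not check that each octet is in the 0-255 range.

```python
import re

# Same dotted-quad pattern as above. It matches anything shaped like an
# IPv4 address and does not validate the 0-255 range of each octet.
IP_PATTERN = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")


def unique_ips(lines):
    """Yield the first IP-like match of each line, skipping repeats.

    Hypothetical helper: addresses come out in first-seen order.
    """
    seen = set()
    for line in lines:
        mo = IP_PATTERN.search(line)
        if mo is None:
            continue  # no IP-like text on this line
        ip_addr = mo.group()
        if ip_addr not in seen:
            seen.add(ip_addr)
            yield ip_addr


def write_unique_ips(in_path, out_path):
    """Hypothetical wrapper: read a log file, write one unique IP per line."""
    # The with statement closes both files even if an error occurs.
    with open(in_path, "r") as infile, open(out_path, "w") as outfile:
        for ip_addr in unique_ips(infile):
            outfile.write("%s\n" % ip_addr)
```

With the paths from the question, this would be called as `write_unique_ips("/var/log/user.log", "/var/log/intruders.txt")`. Because `unique_ips` accepts any iterable of strings, it can also be checked against a plain list of log lines before being pointed at a real file.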