2012-03-11 39 views
1

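Python: find the number of attacks per IP per day

Hey, I'm trying to count the number of attacks from each IP on each day. I'm reading the data from a syslog file.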

These are a few of the lines I'm reading from:

Jan 10 09:32:09 j4-be03 sshd[3876]: Failed password for root from 218.241.173.35 port 47084 ssh2 
Jan 10 09:32:19 j4-be03 sshd[3879]: Failed password for root from 218.241.173.35 port 47901 ssh2 
Feb 7 17:19:16 j4-be03 sshd[10736]: Failed password for root from 89.249.209.92 port 46139 ssh2 

Here is my code:

desc_date = {}
count_date = 0
desc_ip = {}
count_ip = 0

for line in myfile:
    if 'Failed password for' in line:
        line_of_list = line.split()
        #working together
        date_port = ' '.join(line_of_list[0:2])
        date_list = date_port.split(':')
        date = date_list[0]
        if desc_date.has_key(date):
            count_date = desc_date[date]
            count_date = count_date +1
            desc_date[date] = count_date
            #zero out the temporary counter as a precaution
            count_date =0
        else:
            desc_date[date] = 1

        ip_port = line_of_list[-4]
        ip_list = ip_port.split(':')
        ip_address = ip_list[0]
        if desc_ip.has_key(ip_address):
            count_ip = desc_ip[ip_address]
            count_ip = count_ip +1
            desc_ip[ip_address] = count_ip
            #zero out the temporary counter as a precaution
            count_ip =0
        else:
            desc_ip[ip_address] = 1

        resulting = dict(desc_date.items() + desc_ip.items())
        for result in resulting:
            print result,' has', resulting[result] , ' attacks'
At the moment it gives me these results, which are wrong:

Feb 8 has 33 attacks 
218.241.173.35 has 15 attacks 
72.153.93.203 has 14 attacks 
213.251.192.26 has 13 attacks 
66.30.90.148 has 14 attacks 
Feb 7 has 15 attacks 
92.152.92.123 has 5 attacks 
Jan 10 has 28 attacks 
89.249.209.92 has 15 attacks 

The counts for the IP addresses are wrong, and I can't tell where the code goes wrong. I hope someone can help.

+0

Why do you think the IP addresses are wrong? – 2012-03-11 23:13:56

+1

It would help us if you edited your post so that the code is indented correctly. – BobS 2012-03-11 23:18:09

+0

Because, for example, Jan 10 has 28 attacks, so I need the per-IP counts for each day to add up to those 28 attacks. – 2012-03-11 23:18:55

Answers

0

Caveat: untested code.

import re

attacks = {}

# count the attacks per date, then per IP within each date
for line in myfile:
    if 'Failed password for' in line:
        # the date is the month and day at the start of the line, e.g. "Jan 10"
        date = re.match(r'^(\w{3}\s+\d{1,2})\b', line).group(1)
        attacks_date = attacks.get(date, {})
        # re.search, because the IP address is not at the start of the line
        ip = re.search(r'\b(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\b', line).group(1)
        attacks_date[ip] = 1 + attacks_date.get(ip, 0)
        attacks[date] = attacks_date

# output results: the total for each date, followed by that date's IPs
for date, attacks_date in attacks.items():
    print('{0} has {1} attacks'.format(date, sum(attacks_date.values())))
    for ip, n in attacks_date.items():
        print('{0} has {1} attacks'.format(ip, n))
4

Try this solution. I tested it with the sample input from the question and it works correctly:

import re
from collections import defaultdict
pattern = re.compile(r'(\w{3}\s+\d{1,2}).+Failed password for .+? from (\S+)')

def attack_dict(myfile):
    attacks = defaultdict(lambda: defaultdict(int))
    for line in myfile:
        found = pattern.match(line)
        if found:
            date, ip = found.groups()
            attacks[date][ip] += 1
    return attacks

def report(myfile):
    for date, ips in attack_dict(myfile).iteritems():
        print '{0} has {1} attacks'.format(date, sum(ips.itervalues()))
        for ip, n in ips.iteritems():
            print '\t{0} has {1} attacks'.format(ip, n)

Run it like this:

report(myfile) # myfile is the opened file with the log 
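
For anyone on Python 3, the same idea could look roughly like the sketch below. It is a port under assumptions, not the answer's own code: `iteritems`/`itervalues` and the `print` statement are replaced with their Python 3 equivalents, and the names `count_attacks`, `format_report` and `sample_log` are made up for illustration. `sample_log` simply holds the three log lines quoted in the question in place of an opened file.

```python
import re
from collections import defaultdict

# Same pattern as in the answer: month and day at the start of the line,
# then the token that follows "from".
PATTERN = re.compile(r'(\w{3}\s+\d{1,2}).+Failed password for .+? from (\S+)')

def count_attacks(lines):
    """Return {date: {ip: count}} for every failed-password line."""
    attacks = defaultdict(lambda: defaultdict(int))
    for line in lines:
        found = PATTERN.match(line)
        if found:
            date, ip = found.groups()
            attacks[date][ip] += 1
    return attacks

def format_report(attacks):
    """Build the report rows: a per-day total followed by that day's IPs."""
    rows = []
    for date, ips in attacks.items():
        rows.append('{0} has {1} attacks'.format(date, sum(ips.values())))
        for ip, n in ips.items():
            rows.append('\t{0} has {1} attacks'.format(ip, n))
    return rows

# Hypothetical stand-in for the opened log file (lines taken from the question).
sample_log = [
    'Jan 10 09:32:09 j4-be03 sshd[3876]: Failed password for root from 218.241.173.35 port 47084 ssh2',
    'Jan 10 09:32:19 j4-be03 sshd[3879]: Failed password for root from 218.241.173.35 port 47901 ssh2',
    'Feb 7 17:19:16 j4-be03 sshd[10736]: Failed password for root from 89.249.209.92 port 46139 ssh2',
]

for row in format_report(count_attacks(sample_log)):
    print(row)
```

Returning the rows instead of printing inside the loop is only a convenience for checking the result; passing an open file object instead of `sample_log` should work the same way, since both are iterated line by line.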
+2

You could use `pattern.match` in this case. `date, ip = found.groups()` might be more readable. – jfs 2012-03-12 04:47:47

+0

@J.F. Sebastian Thanks for the suggestion, I've edited my answer accordingly. – 2012-03-12 10:45:19

2

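I see two problems. 1) You are counting attacks per day and attacks per IP separately; nothing ties the attacks from a given IP to the dates on which they happened. 2) Iterating over the items in the dictionary, as you do in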

resulting = dict(desc_date.items() + desc_ip.items()) 
for result in resulting: 
    print result,' has', resulting[result] , ' attacks' 

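will give you the cumulative attack counts in essentially random order, freely mixing per-IP counts with per-date counts. The fact that you see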

Feb 8 has 33 attacks 

...followed by

218.241.173.35 has 15 attacks 
72.153.93.203 has 14 attacks 
213.251.192.26 has 13 attacks 
66.30.90.148 has 14 attacks 

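...does not mean that the attacks from those IPs happened on Feb 8. The 15 attacks from 218.241.173.35 are the total number of attacks from that IP over the entire period covered by the log file. It is a coincidence that the 218.241.173.35 line appears right after Feb 8 rather than before or after some other date.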
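
One way to tie each IP's count to a date, sketched here for Python 3 and not necessarily what the asker wants in terms of ordering, is to count (date, IP) pairs in a single structure instead of keeping two unrelated dictionaries. The function names and `sample_log` are hypothetical, and the last log line is invented purely to show the same IP turning up on two different days; the field positions (`fields[0:2]` for the date, `fields[-4]` for the IP) are the ones the question's code already relies on.

```python
from collections import Counter

def count_by_day_and_ip(lines):
    """Count failed logins per (date, ip) pair, so every count belongs to one day."""
    pair_counts = Counter()
    for line in lines:
        if 'Failed password for' not in line:
            continue
        fields = line.split()
        date = ' '.join(fields[0:2])   # e.g. "Jan 10"
        ip = fields[-4]                # the token right after "from"
        pair_counts[(date, ip)] += 1
    return pair_counts

def group_by_day(pair_counts):
    """Regroup the pair counts as {date: {ip: count}}, keeping first-seen order."""
    per_day = {}
    for (date, ip), n in pair_counts.items():
        per_day.setdefault(date, {})[ip] = n
    return per_day

# Three lines from the question plus one INVENTED line (the last one).
sample_log = [
    'Jan 10 09:32:09 j4-be03 sshd[3876]: Failed password for root from 218.241.173.35 port 47084 ssh2',
    'Jan 10 09:32:19 j4-be03 sshd[3879]: Failed password for root from 218.241.173.35 port 47901 ssh2',
    'Feb 7 17:19:16 j4-be03 sshd[10736]: Failed password for root from 89.249.209.92 port 46139 ssh2',
    'Feb 7 17:20:01 j4-be03 sshd[10740]: Failed password for root from 218.241.173.35 port 50000 ssh2',
]

per_day = group_by_day(count_by_day_and_ip(sample_log))
for date, ips in per_day.items():
    # The daily total is the sum of that day's per-IP counts, so the two always agree.
    print('{0} has {1} attacks'.format(date, sum(ips.values())))
    for ip, n in ips.items():
        print('    {0} has {1} attacks'.format(ip, n))
```

Because each IP line is printed under the date it was counted for, the position of a line in the output now carries meaning, and 218.241.173.35 is reported once for each day on which it appears rather than as a single total for the whole log.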

+0

How would I implement this? I understand what you mean, but I'm not sure how to implement it. – 2012-03-12 08:51:22

+0

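How I would proceed depends on the answer to the question I asked in a comment on your main post (about sort order). Sorry for asking it in a different place; it seemed like a generally relevant question, but perhaps you missed it for that reason. – BobS 2012-03-14 03:18:26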