2013-07-16 32 views
3

Python: need to get unique errors from a log file

My Python so far:

def unique_ips():
    f = open('logfile','r')
    ips = set()
    for line in f:
        ip = line.split()[0]
        print ip
        for date in ip:
            logdate = line.split()[3]
            print "\t", logdate
            for entry in logdate:
                info = line.split()[5:11]
                print "\t\t", info
        ips.add(ip)
unique_ips()

The part I'm having trouble with:

 for entry in logdate: 
      info = line.split()[5:20] 
      print "\t\t", info 

I have a log file that I want sorted first by IP, then by time, then by error.

The output should look like this:

199.21.99.83 
     [30/Jun/2013:07:18:30 
       ['"GET', '/searchme/index.php?f=man_soweth', 'HTTP/1.1"', '200', '8676', '"-"'] 

Instead I'm getting:

199.21.99.83 
     [30/Jun/2013:07:18:30 
       ['"GET', '/searchme/index.php?f=man_soweth', 'HTTP/1.1"', '200', '8676', '"-"'] 
       ['"GET', '/searchme/index.php?f=man_soweth', 'HTTP/1.1"', '200', '8676', '"-"'] 
       ['"GET', '/searchme/index.php?f=man_soweth', 'HTTP/1.1"', '200', '8676', '"-"'] 
       ['"GET', '/searchme/index.php?f=man_soweth', 'HTTP/1.1"', '200', '8676', '"-"'] 
       ... 

I'm sure I've run into some kind of syntax problem, but I'd appreciate any help!

The log file looks like:

99.21.99.83 - - [30/Jun/2013:07:15:50 -0500] "GET /lenny/index.php?f=13 HTTP/1.1" 200 11244 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)" 
199.21.99.83 - - [30/Jun/2013:07:16:13 -0500] "GET /searchme/index.php?f=being_fruitful HTTP/1.1" 200 7526 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)" 
199.21.99.83 - - [30/Jun/2013:07:16:45 -0500] "GET /searchme/index.php?f=comparing_themselves HTTP/1.1" 200 7369 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)" 
66.249.73.40 - - [30/Jun/2013:07:16:56 -0500] "GET /espanol/displayAncient.cgi?ref=isa%2054:3 HTTP/1.1" 500 167 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 
199.21.99.83 - - [30/Jun/2013:07:17:00 -0500] "GET /searchme/index.php?f=tribulation HTTP/1.1" 200 7060 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)" 
199.21.99.83 - - [30/Jun/2013:07:17:15 -0500] "GET /searchme/index.php?f=proud HTTP/1.1" 200 7080 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)" 
199.21.99.83 - - [30/Jun/2013:07:17:34 -0500] "GET /searchme/index.php?f=soul HTTP/1.1" 200 7063 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)" 
199.21.99.83 - - [30/Jun/2013:07:17:38 -0500] "GET /searchme/index.php?f=the_flesh_lusteth HTTP/1.1" 200 6951 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.c 
+0

What does the input file look like? Also, don't forget to close f! – mr2ert

+2

'logdate' appears to be a string, so by iterating over it you iterate over each individual character. Your loop simply prints '"\t\t", info' once for each character in 'logdate'. – Blender
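
To make the point above concrete, here is a minimal sketch (written for Python 3, whereas the question uses Python 2 print statements). The 'logdate' value is copied from the question's output; the 'repeats' counter is a hypothetical name added only for illustration.

```python
# A timestamp field as produced by line.split()[3] on the sample log.
logdate = "[30/Jun/2013:07:18:30"

# Iterating over a string visits one character at a time,
# so the loop body runs len(logdate) times.
repeats = 0
for entry in logdate:
    repeats += 1

print(repeats)            # 21, one pass per character
print(list(logdate)[:4])  # ['[', '3', '0', '/']
```

Anything placed inside such a loop, like the print of 'info', is therefore repeated once per character of the timestamp.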

+0

Edited the question with a snippet of the log file – JasonOrtiz

Answers

1

The question is a bit confusing because of the sample output, but I'm pretty sure you want something like this:

def unique_ips():
    f = open('logfile','r')
    ips = {}
    # This for loop collects all of the ips with their associated errors
    for line in f:
        ip = line.split()[0]
        try:
            ips[ip].append(line)
        except KeyError:
            ips[ip] = [line]

    # This for loop goes through all the ips that were collected
    # and prints out all errors for those ips
    for ip, errors in ips.iteritems():
        print ip
        errors.sort()
        for e in errors:
            logdate = e.split()[3]
            print "\t", logdate

            info = e.split()[5:11]
            print "\t\t", info

    f.close()

This produces the following output from your sample file:

199.21.99.83 
    [30/Jun/2013:07:16:13 
     ['"GET', '/searchme/index.php?f=being_fruitful', 'HTTP/1.1"', '200', '7526', '"-"'] 
    [30/Jun/2013:07:16:45 
     ['"GET', '/searchme/index.php?f=comparing_themselves', 'HTTP/1.1"', '200', '7369', '"-"'] 
    [30/Jun/2013:07:17:00 
     ['"GET', '/searchme/index.php?f=tribulation', 'HTTP/1.1"', '200', '7060', '"-"'] 
    [30/Jun/2013:07:17:15 
     ['"GET', '/searchme/index.php?f=proud', 'HTTP/1.1"', '200', '7080', '"-"'] 
    [30/Jun/2013:07:17:34 
     ['"GET', '/searchme/index.php?f=soul', 'HTTP/1.1"', '200', '7063', '"-"'] 
    [30/Jun/2013:07:17:38 
     ['"GET', '/searchme/index.php?f=the_flesh_lusteth', 'HTTP/1.1"', '200', '6951', '"-"'] 
66.249.73.40 
    [30/Jun/2013:07:16:56 
     ['"GET', '/espanol/displayAncient.cgi?ref=isa%2054:3', 'HTTP/1.1"', '500', '167', '"-"'] 
99.21.99.83 
    [30/Jun/2013:07:15:50 
     ['"GET', '/lenny/index.php?f=13', 'HTTP/1.1"', '200', '11244', '"-"'] 
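
The same grouping idea could also be sketched for Python 3 with collections.defaultdict, which avoids the try/except around the first insert. This is only a sketch under some assumptions: the function name 'group_by_ip' and the 'sample' list are hypothetical, the sample lines are taken from the question's log with the user-agent field trimmed, and timestamps are sorted as plain strings, which is chronological only while all entries fall on the same day.

```python
from collections import defaultdict

def group_by_ip(lines):
    """Map each IP to a sorted list of (timestamp, request fields) pairs."""
    grouped = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines (assumption, not in the original)
        grouped[fields[0]].append((fields[3], fields[5:11]))
    for entries in grouped.values():
        entries.sort()  # string sort on the timestamp field
    return grouped

# Lines from the question's log, user-agent trimmed for brevity.
sample = [
    '199.21.99.83 - - [30/Jun/2013:07:16:45 -0500] "GET /searchme/index.php?f=comparing_themselves HTTP/1.1" 200 7369 "-"',
    '66.249.73.40 - - [30/Jun/2013:07:16:56 -0500] "GET /espanol/displayAncient.cgi?ref=isa%2054:3 HTTP/1.1" 500 167 "-"',
    '199.21.99.83 - - [30/Jun/2013:07:16:13 -0500] "GET /searchme/index.php?f=being_fruitful HTTP/1.1" 200 7526 "-"',
]

for ip, entries in group_by_ip(sample).items():
    print(ip)
    for logdate, info in entries:
        print("\t" + logdate)
        print("\t\t" + str(info))
```

Passing an iterable of lines rather than a file name keeps the function easy to try on a few pasted lines; an open file object works as the argument too.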
1

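You have too many loops. You don't need the 'for entry in logdate' loop. You are already iterating over each line.

Remove the 'for entry in logdate' line, then unindent the info assignment and the print statement.

(The comments have already mentioned this.)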
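
A minimal sketch of what the loop might look like with the extra loops removed, written for Python 3. The function name 'report' and its 'lines' parameter are hypothetical; the field positions are the ones used in the question.

```python
def report(lines):
    """Print IP, timestamp and request fields once per log line; return the unique IPs."""
    ips = set()
    for line in lines:
        fields = line.split()
        # One split per line; no loops over the ip or logdate strings.
        ip, logdate, info = fields[0], fields[3], fields[5:11]
        print(ip)
        print("\t" + logdate)
        print("\t\t" + str(info))
        ips.add(ip)
    return ips
```

With a file this could be called as 'with open("logfile") as f: report(f)', which also closes the file when the block ends.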