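Logstash _grokparsefailure when parsing Nginx logs

I want to parse nginx logs with Logstash. Everything looks fine, except that lines containing a real nginx $remote_user get the _grokparsefailure tag. When $remote_user is "-" (the default value when no user is specified), Logstash does its job, but with a real $remote_user such as [email protected] it fails and adds the _grokparsefailure tag: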

127.0.0.1 - - [17/Feb/2017:23:14:08 +0100] "GET /favicon.ico HTTP/1.1" 302 169 "http://training-hub.tn/trainer/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36"

=====> works fine

127.0.0.1 - [email protected] [17/Feb/2017:23:14:07 +0100] "GET /trainer/templates/home.tmpl.html HTTP/1.1" 304 0 "http://training-hub.tn/trainer/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36"

=====> gets the _grokparsefailure tag and the log line is not parsed

I am using this configuration file:

input {
  file {
    path => "/home/dev/node/training-hub/logs/access_log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    ignore_older => 0
    type => "logs"
  }
}

filter {
  if [type] == "logs" {
    mutate {
      gsub => ["message", "::ffff:", ""]
    }
    grok {
      match => [
        "message", "%{COMBINEDAPACHELOG}+%{GREEDYDATA:extra_fields}",
        "message", "%{COMMONAPACHELOG}+%{GREEDYDATA:extra_fields}"
      ]
      overwrite => [ "message" ]
    }

    mutate {
      convert => ["response", "integer"]
      convert => ["bytes", "integer"]
      convert => ["responsetime", "float"]
    }
    geoip {
      source => "clientip"
      target => "geoip"
      database => "/etc/logstash/GeoLite2-City.mmdb"
      add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
      add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
    }
    mutate {
      convert => [ "[geoip][coordinates]", "float" ]
    }

    date {
      match => [ "timestamp", "dd/MMM/YYYY:HH:mm:ss Z" ]
      remove_field => [ "timestamp" ]
    }

    useragent {
      source => "agent"
    }
  }
}

output { elasticsearch { hosts => "localhost:9200" } }

Answer

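After testing the output with many values, I realized that Logstash fails to parse log lines containing this kind of $remote_user because it is not a valid username (it is an email address). So I added a mutate gsub filter that removes the @ and the rest of the email address, producing a valid $remote_user: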

gsub => ["message", "@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\]) \[", " ["]

And now it works fine.
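
As a rough sketch of where such a filter sits: the gsub has to run before the grok filter, so that grok only ever sees the shortened user name. The fragment below uses a deliberately simplified pattern, which is an assumption for illustration and not the regex from the answer. It drops everything from the @ up to the space before the opening bracket of the timestamp, so a hypothetical user like user@example.com would become user. The likely reason the original line fails is that the stock USER grok pattern only accepts letters, digits, dots, underscores and hyphens; newer pattern sets may treat email addresses differently, so check the patterns shipped with your Logstash version.

```
filter {
  if [type] == "logs" {
    mutate {
      # Assumed simplification: strip "@domain" when it is directly followed by
      # the " [" that opens the timestamp. The pattern is not anchored, so it
      # would also rewrite any other "@... [" sequence in the line.
      gsub => ["message", "@[^ ]+ \[", " ["]
    }
    grok {
      # Same patterns as in the question; they now see "user" instead of
      # "user@domain" in the auth position.
      match => [
        "message", "%{COMBINEDAPACHELOG}+%{GREEDYDATA:extra_fields}",
        "message", "%{COMMONAPACHELOG}+%{GREEDYDATA:extra_fields}"
      ]
    }
  }
}
```

The trade-off with either regex is that the domain part of the user name is thrown away before indexing. If the full address matters, a custom grok pattern for the auth field would be the alternative to look at.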