2016-01-08 89 views
0

I have a log file that looks like the one below. How can I use Logstash to filter the JSON data out of a log4j file?

2014-12-24 09:41:29,383 INFO c.t.t.a.c.LoginController.getCSRFToken [http-bio-8080-exec-4] in getCSRFToken 
2014-12-24 09:41:29,383 DEBUG c.t.t.a.c.LoginController.getCSRFToken [http-bio-8080-exec-4] CSRFToken set successfully. 
2014-12-24 09:44:26,607 INFO c.t.t.a.c.LoginController.getCSRFToken [http-bio-8080-exec-8] in getCSRFToken 
2014-12-24 09:44:26,609 DEBUG c.t.t.a.c.LoginController.getCSRFToken [http-bio-8080-exec-8] CSRFToken set successfully. 
2014-12-26 09:55:28,399 INFO c.t.t.a.c.LoginController.getCSRFToken [http-bio-8080-exec-9] in getCSRFToken 
2014-12-26 09:55:28,401 DEBUG c.t.t.a.c.LoginController.getCSRFToken [http-bio-8080-exec-9] CSRFToken set successfully. 
2014-12-26 11:10:32,135 INFO c.t.t.a.c.LoginController.getCSRFToken [http-bio-8080-exec-10] in getCSRFToken 
2014-12-26 11:10:32,136 DEBUG c.t.t.a.c.LoginController.getCSRFToken [http-bio-8080-exec-10] CSRFToken set successfully. 
2014-12-26 11:12:40,500 INFO c.t.t.a.c.LoginController.getCSRFToken [http-bio-8080-exec-7] in getCSRFToken 
2014-12-26 11:12:40,501 DEBUG c.t.t.a.c.LoginController.getCSRFToken [http-bio-8080-exec-7] CSRFToken set successfully. 
2015-11-30 16:21:09,145 INFO c.t.t.s.a.i.AnalyticsServiceImpl.captureHit [http-bio-8080-exec-9] EnquiryDetails : {"createdTime":1448880669029,"modifiedTime":null,"active":true,"deleted":false,"deletedOn":-1,"guid":null,"uuid":null,"id":130771,"instanceId":130665,"pos":"","channel":"Web","flightNo":"TWBL2DL2","orig":"BLR","dest":"DEL","cabCls":"ECONOMY","logCls":"Y","noOfPaxs":1,"scheduleEntryId":130661,"travelDateTime":[2015,12,1,21,30],"enquiryDateTime":[2015,11,30,16,21,9,23000000]} 

As you can see, the last line contains some JSON data. I want to configure Logstash to extract that JSON data. Here is my Logstash configuration file:

input { 
    file { 
    path => "C:/Users/TESTER/Desktop/files/test1.log" 
    type => "test" 
     start_position => "beginning" 
    } 
} 


filter { 
    grok { 
    match => [ "message" , "timestamp : %{DATESTAMP:timestamp}", "severity: %{WORD:severity}", "clazz: %{JAVACLASS:clazz}", "selco: %{NOTSPACE:selco}", "testerField: (?<ENQDTLS>EnquiryDetails :)"] 

     } 
} 


output { 
    elasticsearch { 
     hosts => "localhost" 
     index => "test1" 
    } 
    stdout {} 
} 

However, this is the output I get from Logstash:

C:\logstash-2.0.0\bin>logstash -f test1.conf 
io/console not supported; tty will not be manipulated 
Default settings used: Filter workers: 2 
Logstash startup completed 
2016-01-08T08:02:02.029Z TW 2014-12-24 09:41:29,383 INFO c.t.t.a.c.LoginController.getCSRFToken [http-bio-8080-exec-4] in getCSRFToken 
2016-01-08T08:02:02.029Z TW 2014-12-24 09:44:26,607 INFO c.t.t.a.c.LoginController.getCSRFToken [http-bio-8080-exec-8] in getCSRFToken 
2016-01-08T08:02:02.029Z TW 2014-12-24 09:44:26,609 DEBUG c.t.t.a.c.LoginController.getCSRFToken [http-bio-8080-exec-8] CSRFToken set successfully. 
2016-01-08T08:02:02.029Z TW 2014-12-26 09:55:28,399 INFO c.t.t.a.c.LoginController.getCSRFToken [http-bio-8080-exec-9] in getCSRFToken 
2016-01-08T08:02:02.029Z TW 2014-12-26 09:55:28,401 DEBUG c.t.t.a.c.LoginController.getCSRFToken [http-bio-8080-exec-9] CSRFToken set successfully. 
2016-01-08T08:02:02.029Z TW 2014-12-26 11:10:32,135 INFO c.t.t.a.c.LoginController.getCSRFToken [http-bio-8080-exec-10] in getCSRFToken 
2016-01-08T08:02:02.029Z TW 2014-12-26 11:10:32,136 DEBUG c.t.t.a.c.LoginController.getCSRFToken [http-bio-8080-exec-10] CSRFToken set successfully. 
2016-01-08T08:02:02.029Z TW 2014-12-24 09:41:29,383 DEBUG c.t.t.a.c.LoginController.getCSRFToken [http-bio-8080-exec-4] CSRFToken set successfully. 
2016-01-08T08:02:02.029Z TW 2014-12-26 11:12:40,500 INFO c.t.t.a.c.LoginController.getCSRFToken [http-bio-8080-exec-7] in getCSRFToken 
2016-01-08T08:02:02.029Z TW 2015-11-30 16:21:09,145 INFO c.t.t.s.a.i.AnalyticsServiceImpl.captureHit [http-bio-8080-exec-9] EnquiryDetails : {"createdTime":1448880669029,"modifiedTime":null,"active":true,"deleted":false,"deletedOn":-1,"guid":null,"uuid":null,"id":130771,"instanceId":130665,"pos":"","channel":"Web","flightNo":"TWBL2DL2","orig":"BLR","dest":"DEL","cabCls":"ECONOMY","logCls":"Y","noOfPaxs":1,"scheduleEntryId":130661,"travelDateTime":[2015,12,1,21,30],"enquiryDateTime":[2015,11,30,16,21,9,23000000]} 
2016-01-08T08:02:02.029Z TW 2014-12-26 11:12:40,501 DEBUG c.t.t.a.c.LoginController.getCSRFToken [http-bio-8080-exec-7] CSRFToken set successfully. 

Can someone please tell me what I am doing wrong here? Thanks.

Answers

0

I found a solution to my problem.

input { 
    file { 
    path => "C:/Users/TESTER/Desktop/elk Files 8-1-2015/test1.log" 
     start_position => "beginning" 
    } 
} 


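filter { 
    # Parse the log4j prefix and capture everything after "EnquiryDetails :" as JSONDATA. 
    grok { 
        match => {"message" => "%{DATESTAMP:timestamp} %{WORD:severity} %{JAVACLASS:clazz} %{NOTSPACE:selco} (?<ENQDTLS>EnquiryDetails :) (?<JSONDATA>.*)"} 
        # add_tag is applied only when the match succeeds. 
        add_tag => ["ENQDTLS"] 
    } 

    # Drop every event that grok did not tag, i.e. lines without EnquiryDetails. 
    if "ENQDTLS" not in [tags] { 
        drop { } 
    } 

    mutate { 
        remove_tag => ["ENQDTLS"] 
    } 

    # Expand the captured JSON string into top-level event fields. 
    json { 
        source => "JSONDATA" 
    } 

    # Remove the intermediate fields now that the JSON has been parsed. 
    mutate { 
        remove_field => ["timestamp", "clazz", "selco", "severity", "ENQDTLS", "JSONDATA"] 
    } 
} 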


output { 
    elasticsearch { 
     hosts => "localhost" 
     index => "test3" 
    } 
    stdout { 
    codec => rubydebug 
    } 
} 

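So what I'm doing here is using grok to filter out any line that does not contain the keyword "EnquiryDetails", and then processing the JSON data in the lines that remain. I hope this helps anyone else who might have the same problem. Also, since I'm new to this, I'd like to know whether this is a good approach.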

+0

In your example, most of the lines are not EnquiryDetails lines. It would be more efficient to drop those lines before attempting the grok (etc.): if [message] !~ /EnquiryDetails/ { drop { } } ....
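The suggestion in this comment can be sketched as follows. This is only an illustration, assuming the conditional is placed at the top of the filter block; the shortened grok pattern is a simplification made here for brevity, not the pattern from the answer.

```conf
filter {
    # Discard non-matching lines first, so grok only runs on EnquiryDetails events.
    if [message] !~ /EnquiryDetails/ {
        drop { }
    }

    # Simplified pattern (an assumption for this sketch): capture only the JSON payload.
    grok {
        match => { "message" => "EnquiryDetails : (?<JSONDATA>.*)" }
    }

    # Expand the captured JSON string into event fields.
    json {
        source => "JSONDATA"
    }
}
```

With the drop done up front, the add_tag / remove_tag bookkeeping should no longer be needed, since only EnquiryDetails lines reach grok.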

+0

Thanks :) Did that.

1

You don't say what "wrong" behavior you're seeing, but let's assume you're concerned about the fields missing from the output.

First, use the rubydebug or json codec in your stdout {} output stanza. It will show you much more detail.
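For instance, a minimal output stanza with the rubydebug codec might look like this (only the stdout part is shown; any other outputs would stay as they are):

```conf
output {
    stdout {
        # rubydebug prints every field of each event, not only the raw message.
        codec => rubydebug
    }
}
```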

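Second, it looks like your grok {} is all messed up. grok {} takes an input field and one or more regular expressions to apply to that input. You gave it the input ("message"), but this regular expression:

"timestamp : %{DATESTAMP:timestamp}" 

doesn't match your input, because the literal string "timestamp : " never appears in your input.

You need something more like: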

"%{DATESTAMP} %{WORD:severity}" (etc) 

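I would recommend setting up one grok {} stanza to pull off all the common information (everything up to the ]). Then use another one to handle the different types of messages.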
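A rough sketch of that two-stage layout, under some assumptions: the field names thread and msg are made up for illustration, and TIMESTAMP_ISO8601 and LOGLEVEL are stock grok patterns chosen here because they fit the sample lines, not because the original post used them.

```conf
filter {
    # Stage 1: common prefix shared by every line, up to and including the [thread].
    grok {
        match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:severity} %{NOTSPACE:clazz} \[%{DATA:thread}\] %{GREEDYDATA:msg}" }
    }

    # Stage 2: handle one message type; other types could get their own branch.
    if [msg] =~ /^EnquiryDetails :/ {
        grok {
            match => { "msg" => "^EnquiryDetails : %{GREEDYDATA:JSONDATA}" }
        }
        json {
            source => "JSONDATA"
        }
    }
}
```

Splitting the work this way keeps the prefix pattern in one place, so adding another message type only means adding another conditional branch.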

+0

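Thanks Alain, this helped me a lot. What I wanted was to process the JSON data based on the keyword that precedes it. I have solved the problem and will post the new config code here.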