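Parsing XML data from Filebeat using Logstash
I am using Filebeat to read XML files on Windows and ship them to Logstash, which filters them and sends them on to Elasticsearch.
Filebeat works perfectly and the XML blocks do arrive in Logstash, but it looks like I have misconfigured the Logstash filter that is supposed to parse each XML block into separate fields and index those fields under an Elasticsearch type.
Here is my sample XML data: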
<H_Ticket> <IDH_Ticket>26</IDH_Ticket> <CodeBus>186</CodeBus> <CodeCh>5531</CodeCh> <CodeConv>5531</CodeConv> <Codeligne>12</Codeligne> <Date>20150915</Date> <Heur>1110</Heur> <NomFR1>SOUK AHAD</NomFR1> <NomFR2>KANTAOUI </NomFR2> <Prix>0.66</Prix> <IDTicket>26</IDTicket> <CodeRoute>107</CodeRoute> <origine>01</origine> <Distination>06</Distination> <Num>6</Num> <Ligne>107</Ligne> <requisition> </requisition> <voyage>0</voyage> <faveur> </faveur> </H_Ticket>
<H_Ticket> <IDH_Ticket>26</IDH_Ticket> <CodeBus>186</CodeBus> <CodeCh>5531</CodeCh> <CodeConv>5531</CodeConv> <Codeligne>12</Codeligne> <Date>20150915</Date> <Heur>1110</Heur> <NomFR1>SOUK AHAD</NomFR1> <NomFR2>KANTAOUI </NomFR2> <Prix>0.66</Prix> <IDTicket>26</IDTicket> <CodeRoute>107</CodeRoute> <origine>01</origine> <Distination>06</Distination> <Num>6</Num> <Ligne>107</Ligne> <requisition> </requisition> <voyage>0</voyage> <faveur> </faveur> </H_Ticket>
<H_Ticket> <IDH_Ticket>26</IDH_Ticket> <CodeBus>186</CodeBus> <CodeCh>5531</CodeCh> <CodeConv>5531</CodeConv> <Codeligne>12</Codeligne> <Date>20150915</Date> <Heur>1110</Heur> <NomFR1>SOUK AHAD</NomFR1> <NomFR2>KANTAOUI </NomFR2> <Prix>0.66</Prix> <IDTicket>26</IDTicket> <CodeRoute>107</CodeRoute> <origine>01</origine> <Distination>06</Distination> <Num>6</Num> <Ligne>107</Ligne> <requisition> </requisition> <voyage>0</voyage> <faveur> </faveur> </H_Ticket>
Here is my Logstash configuration file:
input {
  beats {
    port => 5044
  }
}
filter
{
  xml
  {
    source => "ticket"
    xpath =>
    [
      "/ticket/IDH_Ticket/text()", "ticketId",
      "/ticket/CodeBus/text()", "codeBus",
      "/ticket/CodeCh/text()", "codeCh",
      "/ticket/CodeConv/text()", "codeConv",
      "/ticket/Codeligne/text()", "codeLigne",
      "/ticket/Date/text()", "date",
      "/ticket/Heur/text()", "heure",
      "/ticket/NomFR1/text()", "nomFR1",
      "/ticket/NomAR1/text()", "nomAR1",
      "/ticket/NomFR2/text()", "nomFR2",
      "/ticket/NomAR2/text()", "nomAR2",
      "/ticket/Prix/text()", "prix",
      "/ticket/IDTicket/text()", "idTicket",
      "/ticket/CodeRoute/text()", "codeRoute",
      "/ticket/origine/text()", "origine",
      "/ticket/Distination/text()", "destination",
      "/ticket/Num/text()", "num",
      "/ticket/Ligne/text()", "ligne",
      "/ticket/requisition/text()", "requisition",
      "/ticket/voyage/text()", "voyage",
      "/ticket/faveur/text()", "faveur"
    ]
    store_xml => true
    target => "doc"
  }
}
output
{
  elasticsearch
  {
    hosts => "localhost"
    index => "buses"
    document_type => "ticket"
  }
  file {
    path => "C:\busesdata\logstash.log"
  }
  stdout { codec => rubydebug }
}
Filebeat configuration:
filebeat:
  # List of prospectors to fetch data.
  prospectors:
    - paths:
        - C:\busesdata\*.xml
      input_type: log
      document_type: ticket
      scan_frequency: 10s
      multiline:
        pattern: '<H_Ticket'
        negate: true
        match: after
output:
  ### Logstash as output
  logstash:
    hosts: ["localhost:5044"]
    index: filebeat
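To make the effect of the multiline settings concrete, here is a minimal Python sketch of how `pattern: '<H_Ticket'` with `negate: true` and `match: after` groups lines into events. It is only an emulation for illustration, not Filebeat's own code; the function name `group_multiline` and the shortened sample lines are assumptions made for this example.

```python
import re

def group_multiline(lines, pattern, negate=True):
    """Emulate 'match: after': continuation lines are appended to the previous event.

    Hypothetical helper for illustration only, not part of Filebeat.
    """
    regex = re.compile(pattern)
    events = []
    for line in lines:
        matches = bool(regex.search(line))
        # With negate: true, a line that does NOT match the pattern is a continuation.
        is_continuation = (not matches) if negate else matches
        if is_continuation and events:
            events[-1].append(line)
        else:
            # A matching line (or the very first line) starts a new event.
            events.append([line])
    return ["\n".join(event) for event in events]

# Shortened, assumed sample of a file such as ticket2.xml.
sample_lines = [
    '<?xml version="1.0"?>',
    "<HF_DOCUMENT>",
    "\t<H_Ticket>",
    "\t\t<IDH_Ticket>1</IDH_Ticket>",
    "\t</H_Ticket>",
    "\t<H_Ticket>",
    "\t\t<IDH_Ticket>2</IDH_Ticket>",
    "\t</H_Ticket>",
    "</HF_DOCUMENT>",
]

for event in group_multiline(sample_lines, "<H_Ticket"):
    print(repr(event))
```

Under these assumptions the header lines up to `<HF_DOCUMENT>` form one event and each `<H_Ticket>` block forms another, so every ticket event is rooted at `H_Ticket`. The last event would also pick up the closing `</HF_DOCUMENT>` line, which may matter when that event is parsed as XML.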
Here is part of both the stdout and the file output:
PS C:\logstash-2.3.3\bin> .\logstash -f .\logstash_temp.conf
io/console not supported; tty will not be manipulated
Settings: Default pipeline workers: 4
Pipeline main started
{
"message" => "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\r\n<?xml-stylesheet href=\"ticket.xsl\" type=\"text/xsl\"?>\n<HF_DOCUMENT>",
"@version" => "1",
"@timestamp" => "2016-07-03T12:13:28.892Z",
"source" => "C:\\busesdata\\ticket2.xml",
"type" => "ticket",
"input_type" => "log",
"fields" => nil,
"beat" => {
"hostname" => "hp-pavillion-g6",
"name" => "hp-pavillion-g6"
},
"offset" => 0,
"count" => 1,
"host" => "hp-pavillion-g6",
"tags" => [
[0] "beats_input_codec_plain_applied"
]
}
{
"message" => "\t<H_Ticket>\r\n\t\t<IDH_Ticket>1</IDH_Ticket>\r\n\t\t<CodeBus>186</CodeBus>\r\n\t\t<CodeCh>5531</CodeCh>\r\n\t\t<CodeConv>5531</CodeConv>\r\n\t\t<Codeligne>12</Codeligne>\r\n\t\t<Date>20150903</Date>\r\n\t\t<Heur>1101</Heur>\r\n\t\t<NomFR1>SOUK AHAD</NomFR1>\r\n\t\t<NomAR1>??? ?????</NomAR1>\r\n\t\t<NomFR2>SOVIVA </NomFR2>\r\n\t\t<NomAR2>??????</NomAR2>\r\n\t\t<Prix>0.66</Prix>\r\n\t\t<IDTicket>1</IDTicket>\r\n\t\t<CodeRoute>107</CodeRoute>\r\n\t\t<origine>01</origine>\r\n\t\t<Distination>07</Distination>\r\n\t\t<Num>3</Num>\r\n\t\t<Ligne>107</Ligne>\r\n\t\t<requisition> </requisition>\r\n\t\t<voyage>0</voyage>\r\n\t\t<faveur> </faveur>\r\n\t</H_Ticket>",
"@version" => "1",
"@timestamp" => "2016-07-03T12:13:28.892Z",
"input_type" => "log",
"source" => "C:\\busesdata\\ticket2.xml",
"offset" => 125,
"type" => "ticket",
"count" => 1,
"fields" => nil,
"beat" => {
"hostname" => "hp-pavillion-g6",
"name" => "hp-pavillion-g6"
},
"host" => "hp-pavillion-g6",
"tags" => [
[0] "beats_input_codec_plain_applied"
]
}
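The events printed above suggest two mismatches with the filter: the XML sits in the `message` field (there is no `ticket` field, only `type` with the value `ticket`), and the root element of each block is `H_Ticket`, not `ticket`. The Python sketch below checks both points against a trimmed copy of the second event. It uses `xml.etree.ElementTree` and a hypothetical helper, `absolute_text`, that handles only simple `/root/child` paths; it is not the XPath engine Logstash uses.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the second event printed above; only two child elements are kept.
event = {
    "message": "\t<H_Ticket>\r\n\t\t<IDH_Ticket>1</IDH_Ticket>\r\n"
               "\t\t<CodeBus>186</CodeBus>\r\n\t</H_Ticket>",
    "type": "ticket",
}

def absolute_text(xml_string, path):
    """Return the text of elements matched by a simple absolute path like /root/child.

    Hypothetical helper for illustration; it does not implement full XPath.
    """
    root = ET.fromstring(xml_string)
    root_name, _, child_path = path.strip("/").partition("/")
    if root.tag != root_name:
        # An absolute path whose first step is not the root element matches nothing.
        return []
    return [element.text for element in root.findall(child_path)]

print(event.get("ticket"))                                       # None: no such field
print(absolute_text(event["message"], "/ticket/IDH_Ticket"))     # []
print(absolute_text(event["message"], "/H_Ticket/IDH_Ticket"))   # ['1']
```

If the same reasoning carries over to the xml filter, it would point toward `source => "message"` and XPath expressions rooted at `/H_Ticket/`, though that is an assumption to verify against the rubydebug output rather than a confirmed fix.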
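Can you paste the Logstash output produced with `stdout { codec => rubydebug }`? – Arpit
I thought it was a mapping problem, but after setting the type mapping manually in ES and trying again, Logstash still did not send any data to ES... I am fairly sure it is a filtering problem :/ –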