Using the LogStash JSON filter is very simple, and it is described in the official docs. All you need to do is create a special object mapping in your index:
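Such a mapping could look like this (a minimal sketch, assuming Elasticsearch 7.x and an index named logstash; the dynamic setting is an illustrative choice, adjust it to your setup):

PUT logstash/_mapping
{
  "properties": {
    "data": {
      "type": "object",
      "dynamic": true
    }
  }
}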
Then add something like this to the LogStash config:
filter {
  json {
    skip_on_invalid_json => true
    source => "message"
    target => "data"
    add_tag => [ "_message_json_parsed" ]
  }
}
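With this filter in place, an event whose message holds a JSON object gets it unpacked under data. The field values below are made up for illustration:

# Incoming event:
#   message => '{"user": "bob", "action": "login"}'
# After the json filter:
#   data => { "user" => "bob", "action" => "login" }
#   tags => [ "_message_json_parsed" ]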
But what can go wrong? With this configuration you can see a lot of warnings like this:
[2020-07-06T18:51:37,837][WARN ][logstash.outputs.elasticsearch][main][...]
Could not index event to Elasticsearch.
{
  :status=>400,
  :action=>[
    "index", {
      :_id=>nil,
      :_index=>"logstash",
      :routing=>nil,
      :_type=>"_doc"
    },
    #<LogStash::Event:0x2bd37cf7>
  ],
  :response=>{
    "index"=>{
      "_index"=>"logstash-2020.07.06-000007",
      "_type"=>"_doc",
      "_id"=>"1wN4JXMBewtat5szJUjd",
      "status"=>400,
      "error"=>{
        "type"=>"mapper_parsing_exception",
        "reason"=>"object mapping for [data] tried to parse field [data] as object, but found a concrete value"
      }
    }
  }
}
What does it mean? The LogStash JSON parser is not strict: if a message contains valid JSON that is a bare string rather than an object, the data field will hold just that string, not an object. Moreover, if this happens right after a log rotation, it can create a data field mapped to the string type, which causes more problems, such as having to re-create the index.
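For illustration, a log line whose whole body is a quoted string is perfectly valid JSON, so the filter parses it without complaint (the values below are made up):

# Incoming event:
#   message => '"service restarted"'
# After the json filter, [data] is a concrete string value, not an object:
#   data => "service restarted"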
To avoid this, you need to extend your LogStash configuration with additional logic:
filter {
  json {
    skip_on_invalid_json => true
    source => "message"
    target => "data"
    add_tag => [ "_message_json_parsed" ]
  }
  if [data] =~ /.*/ {
    mutate {
      remove_field => [ "data" ]
    }
  }
}
What does it do? LogStash has no way to check that data is a valid object, so instead we check whether it is a string: any string matches the regex /.*/, while a parsed object does not. So when the condition is true, data must be a string, and we remove the field. Now only properly parsed JSON objects end up in the data property of our logs.
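To make the two paths concrete, here is how the conditional treats two hypothetical events (field values are illustrative):

# message => '{"user": "bob"}'  -> data => { "user" => "bob" }; /.*/ does not match a hash, field is kept
# message => '"just a string"'  -> data => "just a string"; /.*/ matches, field is removed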