
LogStash JSON filter

·351 words·2 mins·
DevOps Elk

Usage of the LogStash JSON filter is very simple and is described in the official docs. All you need to do is create a special object mapping in your index:

PUT /logstash/_mapping?pretty HTTP/1.0
Content-Type: application/json

{
  "properties": {
    "data": {
      "type": "object",
      "dynamic": true
    }
  }
}
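For reference, the same mapping request can be sent with curl (assuming Elasticsearch is listening on the default localhost:9200):

```shell
curl -X PUT "localhost:9200/logstash/_mapping?pretty" \
  -H 'Content-Type: application/json' \
  -d '{
    "properties": {
      "data": {
        "type": "object",
        "dynamic": true
      }
    }
  }'
```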

And add something like this to the LogStash config:

filter {
    json {
        skip_on_invalid_json => true
        source => "message"
        target => "data"
        add_tag => [ "_message_json_parsed" ]
    }
}

But what can go wrong? With this configuration you may see a lot of warnings like this:

[2020-07-06T18:51:37,837][WARN ][logstash.outputs.elasticsearch][main][...]
Could not index event to Elasticsearch.
{
	:status=>400,
	:action=>[
		"index", {
			:_id=>nil,
			:_index=>"logstash",
			:routing=>nil,
			:_type=>"_doc"
		},
		#<LogStash::Event:0x2bd37cf7>
	],
	:response=>{
		"index"=>{
			"_index"=>"logstash-2020.07.06-000007",
			"_type"=>"_doc",
			"_id"=>"1wN4JXMBewtat5szJUjd",
			"status"=>400,
			"error"=>{
				"type"=>"mapper_parsing_exception",
				"reason"=>"object mapping for [data] tried to parse field [data] as object, but found a concrete value"
			}
		}
	}
}

What does it mean? The LogStash JSON parser is not that strict: if a message doesn't contain a valid JSON object but is a valid JSON string, the data field will contain only that string, not an "object", so Elasticsearch rejects the event.
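The distinction is easy to see with Python's json module as a stand-in for the filter's parser: a quoted string is perfectly valid JSON, it just parses into a concrete value instead of an object.

```python
import json

# A message holding a JSON object parses into a dict ("object"),
# which matches the mapping above:
event = json.loads('{"user": "alice", "status": 200}')

# But a bare quoted string is *also* valid JSON. It parses into a
# concrete value, and that is exactly what triggers the
# mapper_parsing_exception:
scalar = json.loads('"just a plain, but valid, JSON string"')

print(type(event).__name__)   # dict
print(type(scalar).__name__)  # str
```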

Moreover, if this happens right after a log rotation, it can create a data field mapped to the string type, which causes further problems, such as a required index re-creation.

To avoid this, you need to extend your LogStash configuration with additional logic:

filter {
    json {
        skip_on_invalid_json => true
        source => "message"
        target => "data"
        add_tag => [ "_message_json_parsed" ]
    }

    if [data] =~ /.*/ {
        mutate {
            remove_field => [ "data" ]
        }
    }
}

What does it do? LogStash has no way to check that data is a valid object, so instead we check that it is not a string. If data is a string, the regex matches and we remove the data field; if it is a parsed object, the condition does not match and the field is kept. So now only properly parsed JSON objects end up in the data property of our logs.
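The whole filter chain can be sketched in Python under stated assumptions: parse_message is a hypothetical stand-in for the pipeline, and the field names mirror the LogStash config above.

```python
import json

def parse_message(event: dict) -> dict:
    """Parse [message] into [data], then drop [data] when the parser
    produced a bare string instead of an object."""
    try:
        parsed = json.loads(event.get("message", ""))
    except ValueError:
        return event  # skip_on_invalid_json => true
    event["data"] = parsed
    # Equivalent of: if [data] =~ /.*/ { mutate { remove_field => ["data"] } }
    if isinstance(event["data"], str):
        del event["data"]
    return event

print(parse_message({"message": '{"level": "info"}'}))  # keeps "data"
print(parse_message({"message": '"plain string"'}))     # "data" removed
```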

Author
@soar
Senior SRE/DevOps engineer
