I'd like to parse and index JSON data that comes from MQTT.
Let's say that (for now) it is a simple time-value JSON:
{"time": "2020-04-07 16:30:00", "value": 40}
I've installed the MQTT Modular Input, cloned the default "_json" source type, and named the clone "simple_json". The only thing I changed was setting "Timestamp fields" to "time".
I've added a new MQTT Data Input:
Now I'm sending a single message (using MQTTBox):
{"time": "2020-04-07 16:30:00", "value": 40}
In splunk/data/var/log/splunk/splunkd.log I can see:
04-07-2020 15:17:07.800 +0000 ERROR JsonLineBreaker - JSON StreamId:14709566222301315061 had parsing error:Unexpected character while looking for value: 'T' - data_source="mqtt://simple_json_mqtt", data_host="splunk", data_sourcetype="simple_json"
Search for sourcetype="simple_json" returns no results.
Let's try with empty lines before and after the JSON:
`{"time": "2020-04-07 16:31:00", "value": 41}`
In log:
`04-07-2020 15:19:34.655 +0000 ERROR JsonLineBreaker - JSON StreamId:14709566222301315061 had parsing error:Unexpected character while looking for value: 'T' - data_source="mqtt://simple_json_mqtt", data_host="splunk", data_sourcetype="simple_json"`
Search for sourcetype="simple_json" returns:
{"time": "2020-04-07 16:31:00", "value": 41}
OK, now let's try to send two "events" in one MQTT message (with an empty line at the end):
`{"time": "2020-04-07 16:32:00", "value": 42}
{"time": "2020-04-07 16:33:00", "value": 43}`
In log:
`04-07-2020 15:23:21.951 +0000 ERROR JsonLineBreaker - JSON StreamId:14709566222301315061 had parsing error:Unexpected character while looking for value: 'T' - data_source="mqtt://simple_json_mqtt", data_host="splunk", data_sourcetype="simple_json"`
Search for sourcetype="simple_json" returns:
{"time": "2020-04-07 16:33:00", "value": 43}
{"time": "2020-04-07 16:31:00", "value": 41}
So I guess there is some kind of problem with the LINE_BREAKER setting in the source type (set by default to ([\r\n]+)).
In a real-world scenario, I won't be able to control the format of the JSON messages published to the MQTT topic:
- order of fields
- existence of fields (let's say that "time" and "value" will always be there, but other objects/arrays/simple fields may also appear)
- LINE_BREAKER
Is it even possible to configure the input type / source type so that it can parse "anything"?
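For illustration only, here is a minimal Python sketch of what parsing "anything" would mean outside Splunk: it pulls consecutive JSON objects out of one payload regardless of field order, extra fields, or whatever whitespace separates them. This is not Splunk configuration; `iter_json_objects` is a hypothetical helper name, and the sketch relies only on the standard library's `json.JSONDecoder.raw_decode`.

```python
import json
from typing import Any, Iterator


def iter_json_objects(payload: str) -> Iterator[Any]:
    """Yield each JSON value found back to back in `payload`.

    Hypothetical helper: it assumes values are separated only by
    optional whitespace (spaces, newlines, or nothing at all).
    """
    decoder = json.JSONDecoder()
    pos = 0
    end = len(payload)
    while pos < end:
        # Skip any whitespace between (or around) the JSON values.
        while pos < end and payload[pos].isspace():
            pos += 1
        if pos >= end:
            break
        # raw_decode parses one value starting at `pos` and returns
        # the value together with the index just past it.
        obj, pos = decoder.raw_decode(payload, pos)
        yield obj


if __name__ == "__main__":
    message = (
        '\n{"time": "2020-04-07 16:32:00", "value": 42}\n'
        '{"value": 43, "extra": {"tags": [1, 2]}, "time": "2020-04-07 16:33:00"} '
        '{"time": "2020-04-07 16:34:00", "value": 44}\n'
    )
    for event in iter_json_objects(message):
        print(event["time"], event["value"])
```

The point of the sketch is that a real JSON parser, rather than a separator pattern, decides where one object ends, which is why nested objects and arbitrary separators do not confuse it.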
{"time": "2020-04-07 16:33:00", "value": 43} {"time": "2020-04-07 16:31:00", "value": 41}
The separator may be some other extra character, not [\r\n]+
LINE_BREAKER = }(.)
How about this?
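A rough way to reason about a candidate like `}(.)` before putting it into props.conf is to simulate it. The Python sketch below assumes the documented LINE_BREAKER behaviour (the text matched by the first capturing group is discarded, what precedes it ends one event, and what follows starts the next). `split_events` is a hypothetical helper, and Python's `re` module is only an approximation of the regex engine Splunk uses.

```python
import json
import re
from typing import List


def split_events(stream: str, line_breaker: str) -> List[str]:
    """Split `stream` into events, mimicking the assumed LINE_BREAKER rules.

    Assumption: the first capturing group is thrown away; text before it
    closes the previous event and text after it opens the next one.
    """
    pattern = re.compile(line_breaker)
    events = []
    start = 0
    for match in pattern.finditer(stream):
        events.append(stream[start:match.start(1)])
        start = match.end(1)
    events.append(stream[start:])
    # Drop empty fragments, e.g. when the stream ends with a separator.
    return [event for event in events if event]


def is_valid_json(text: str) -> bool:
    """Return True if `text` parses as a single JSON document."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    # The pattern from the suggestion; note that in Python `.` does not
    # match a newline unless re.DOTALL is set.
    breaker = r"\}(.)"

    flat = ('{"time": "2020-04-07 16:33:00", "value": 43} '
            '{"time": "2020-04-07 16:31:00", "value": 41}')
    nested = '{"time": "2020-04-07 16:34:00", "meta": {"unit": "C"}, "value": 44}'

    for sample in (flat, nested):
        parts = split_events(sample, breaker)
        print(len(parts), [is_valid_json(part) for part in parts])
```

Under these assumptions the pattern separates the space-delimited objects cleanly, but it also cuts a message containing a nested object right after the inner `}`, so it may not be safe if arbitrary nested fields can appear.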