We have users migrating apps that previously used Universal Forwarders (UF) to Docker containers. The Splunk logging driver for Docker embeds the logged JSON fields inside a 'line' object, as in the sanitized example below; with a UF these fields are not nested under 'line'. A number of existing reports, dashboards, and alerts won't work with the new logging solution because they don't expect to reference fields with a 'line.' prefix, for example line.port instead of just port. The goal is to extract the JSON fields out of 'line' and place them back in _raw so the reports and dashboards work with either implementation.
Example (simplified) event:
{"line":{"_t":"2020-03-27T03:17:25.491296Z","logger":"some.logger","level":"INFO","env":"dev","port":"8000","process_id":51,"thread_id":140005384098624,"hostname":"964619888c0d"},"source":"stdout","tag":"some.instance.tag"}
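To make the goal concrete, here is a minimal Python sketch of the target shape, outside Splunk entirely. The helper name `hoist_line` is hypothetical and only for illustration; it shows which fields existing searches would expect at the top level once 'line' is unwrapped.

```python
import json

# The simplified event from above, as emitted by the Docker Splunk logging driver.
raw = (
    '{"line":{"_t":"2020-03-27T03:17:25.491296Z","logger":"some.logger",'
    '"level":"INFO","env":"dev","port":"8000","process_id":51,'
    '"thread_id":140005384098624,"hostname":"964619888c0d"},'
    '"source":"stdout","tag":"some.instance.tag"}'
)

def hoist_line(event: str) -> dict:
    """Hypothetical helper: move the keys nested under 'line' to the top level."""
    doc = json.loads(event)
    inner = doc.pop("line", {})
    if isinstance(inner, dict):
        # JSON log line: merge its keys next to 'source' and 'tag'.
        doc.update(inner)
    else:
        # Plain-text log line: nothing to hoist, keep it under 'line'.
        doc["line"] = inner
    return doc

flat = hoist_line(raw)
print(flat["port"])    # the field existing searches expect, instead of line.port
print(sorted(flat))    # top-level field names after hoisting
```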
I'm trying to build a props/transforms solution that extracts the json out of 'line' and places those fields back at the '_raw' event level. Here's what I have so far:
local.meta
[]
export = system
props.conf
[docker_line_extract]
REPORT-line = extract_line_object, extract_line_objects
transforms.conf
[extract_line_object]
REGEX = \{\"line\":\{(?<lineobj>.*)\},
[extract_line_objects]
REGEX = \"(?<_KEY_1>[^="\\]+)\":\s?\"?(?<_VAL_1>[^="\\]*)
FORMAT = $1::$2
SOURCE_KEY = field:lineobj
DEST_KEY = _raw
REPEAT_MATCH = true
The above succeeds in extracting the json field/values out of 'line' - the 'lineobj' field appears in the fields list in Splunk Web; clicking one reveals the expected content: "_t":"2020-03-27T03:17:25.491296Z","logger":"some.logger","level":"INFO","env":"dev","port":"8000","process_id":51,"thread_id":140005384098624,"hostname":"964619888c0d"
So that part is working, but I can't get the JSON field/value pairs extracted out of 'lineobj' and placed in the _raw event as desired. I've tried a lot of variations with no luck. Does anyone have any insights or a solution? Thank you.
| makeresults
| eval _raw="{\"line\":{\"_t\":\"2020-03-27T03:17:25.491296Z\",\"logger\":\"some.logger\",\"level\":\"INFO\",\"env\":\"dev\",\"port\":\"8000\",\"process_id\":51,\"thread_id\":140005384098624,\"hostname\":\"964619888c0d\"},\"source\":\"stdout\",\"tag\":\"some.instance.tag\"}"
| rex mode=sed "s/{.*({.*}).*}/\1/"
| spath
spath works.
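For anyone who wants to see what the sed expression leaves behind before trying it in Splunk, here is a rough Python sketch of the same two steps. It assumes Python's `re` handles this pattern the same way Splunk's sed mode does, and `json.loads` only stands in for spath; neither is Splunk's actual implementation.

```python
import json
import re

raw = (
    '{"line":{"_t":"2020-03-27T03:17:25.491296Z","logger":"some.logger",'
    '"level":"INFO","env":"dev","port":"8000","process_id":51,'
    '"thread_id":140005384098624,"hostname":"964619888c0d"},'
    '"source":"stdout","tag":"some.instance.tag"}'
)

# Same idea as s/{.*({.*}).*}/\1/ : replace the whole event with the inner {...} object.
trimmed = re.sub(r"\{.*(\{.*\}).*\}", r"\1", raw)

# Stand-in for spath: parse the trimmed event as JSON.
fields = json.loads(trimmed)

print(trimmed)
print(sorted(fields))
```

In this sketch the trimmed event contains only the fields that were nested under 'line'; the outer source and tag values are no longer part of it, which is worth knowing if any searches rely on them.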
props.conf
[docker_line_extract]
SEDCMD-trim_line = s/{.*({.*}).*}/\1/
KV_MODE = json
You can use KV_MODE = json; no other field extraction is needed.
Figured it out - I was trying too hard. This entry in props.conf, applied to events with sourcetype 'docker_line_extract', solved the problem. The 'line.' object still shows up in the search results, but the extracted fields appear in the fields list, and you can reference them and build filters with them as desired:
[docker_line_extract]
EXTRACT-line = (\{\"line\":\{)?\"(?<_KEY_1>[^=",]+)\":\s?\"?(?<_VAL_1>[^=",]*)
Evidently the _KEY_1 and _VAL_1 naming convention extracts all the field/value pairs nested in the 'line' object.
The (\{\"line\":\{)? part at the start of the regex prevents an empty "line" field from being extracted as well.
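A quick way to sanity-check the regex outside Splunk is to run it over the sample event in Python. This is only an approximation: the named groups are rewritten to Python's `(?P<name>...)` syntax, and `finditer` is assumed to mimic Splunk applying the pattern repeatedly across the event.

```python
import re

raw = (
    '{"line":{"_t":"2020-03-27T03:17:25.491296Z","logger":"some.logger",'
    '"level":"INFO","env":"dev","port":"8000","process_id":51,'
    '"thread_id":140005384098624,"hostname":"964619888c0d"},'
    '"source":"stdout","tag":"some.instance.tag"}'
)

# The EXTRACT-line pattern, including the optional {"line":{ prefix.
with_prefix = re.compile(r'(\{"line":\{)?"(?P<_KEY_1>[^=",]+)":\s?"?(?P<_VAL_1>[^=",]*)')
# The same pattern without the prefix, for comparison.
without_prefix = re.compile(r'"(?P<_KEY_1>[^=",]+)":\s?"?(?P<_VAL_1>[^=",]*)')

def extract(pattern: re.Pattern, event: str) -> dict:
    """Collect every _KEY_1/_VAL_1 pair the pattern finds in the event."""
    return {m.group("_KEY_1"): m.group("_VAL_1") for m in pattern.finditer(event)}

fields = extract(with_prefix, raw)
stray = extract(without_prefix, raw)

for key, value in fields.items():
    print(f"{key} = {value}")
print("line" in fields, "line" in stray)
```

In this emulation the prefixed pattern yields the eight nested fields plus source and tag, all as strings, while the version without the prefix also picks up a stray line field whose value is just the opening brace.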
And of course, you can add this under a [default] stanza in props.conf if you need it applied to every sourcetype:
[default]
EXTRACT-line = (\{\"line\":\{)?\"(?<_KEY_1>[^=",]+)\":\s?\"?(?<_VAL_1>[^=",]*)
[other_sourcetypes]
...
Correction to transforms.conf - should have used the code block the first time - apologies:
[extract_line_object]
REGEX = \{\"line\":\{(?<lineobj>.*)\},
[extract_line_objects]
REGEX = \"(?<_KEY_1>[^="\\]+)\":\s?\"?(?<_VAL_1>[^="\\]*)
FORMAT = $1::$2
SOURCE_KEY = field:lineobj
DEST_KEY = _raw
REPEAT_MATCH = true
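For comparison, the two corrected regexes can be chained in Python to see what each stage would capture from the sample event. This is a sketch under the same assumptions as any offline test: named groups use Python's `(?P<name>...)` form, `finditer` stands in for repeated matching, and nothing here models DEST_KEY or how Splunk writes results back.

```python
import re

raw = (
    '{"line":{"_t":"2020-03-27T03:17:25.491296Z","logger":"some.logger",'
    '"level":"INFO","env":"dev","port":"8000","process_id":51,'
    '"thread_id":140005384098624,"hostname":"964619888c0d"},'
    '"source":"stdout","tag":"some.instance.tag"}'
)

# Stage 1, [extract_line_object]: capture everything inside "line":{...} as lineobj.
stage1 = re.search(r'\{"line":\{(?P<lineobj>.*)\},', raw)
lineobj = stage1.group("lineobj")

# Stage 2, [extract_line_objects]: pull key/value pairs out of lineobj.
pair = re.compile(r'"(?P<_KEY_1>[^="\\]+)":\s?"?(?P<_VAL_1>[^="\\]*)')
fields = {m.group("_KEY_1"): m.group("_VAL_1") for m in pair.finditer(lineobj)}

print(lineobj)
print(fields["port"])
print(fields["process_id"])   # unquoted number: note the trailing comma
```

Stage 1 reproduces the lineobj content shown in the question. In stage 2, the character class excludes quotes and backslashes but not commas, so in this emulation the unquoted values process_id and thread_id come back with a trailing comma; the `[^=",]` class used in the EXTRACT-line answer avoids that. As far as I can tell from transforms.conf.spec, DEST_KEY and REPEAT_MATCH are also documented as index-time-only settings, which may be why a search-time REPORT never rewrote _raw.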