There are all kinds of questions (and not too many answers) about processing nested JSON, either at the source or in search. I have some nested JSON whose fields the spath
command can extract, but the display in the Search & Reporting app is still only one JSON level deep. For example:
{ [-]
log: {"message":"looks like we got no XML document","context":{"status":400,"traceId":"aacb332c-e907-352b-9f8b-a72a55d75cd0","path":"somepath","method":"GET","account_id":1234},"level":200,"level_name":"INFO","channel":"lumen","datetime":{"date":"2018-10-17 20:49:01.839792","timezone_type":3,"timezone":"UTC"},"extra":[]}
stream: stdout
time: 2018-10-17T20:49:01.841051338Z
}
The spath
command successfully extracts the fields in the "log" element, but I'd like to actually see the "log" properly formatted:
{
  "channel": "lumen",
  "context": {
    "account_id": 1234,
    "method": "GET",
    "path": "somepath",
    "status": 400,
    ...etc
  },
  "message": "looks like we got no XML document"
}
Any way to do this in a search?
Try the configuration below on the indexer or heavy forwarder, whichever receives the data first from the Universal Forwarder, and remove INDEXED_EXTRACTIONS = json
from the Universal Forwarder.
props.conf
[yourSourcetype]
SHOULD_LINEMERGE=true
NO_BINARY_CHECK=true
SEDCMD-removeslash=s/(?:\\"|\\\\")/"/g
SEDCMD-removenewline=s/\\\\n//g
TIME_PREFIX="time":\s"
MAX_TIMESTAMP_LOOKAHEAD=30
With the above configuration I was not able to convert \\n
into an actual newline, so I removed it with a SEDCMD. As a result you will see one long string in the log.message
field without line breaks, which may look a bit ugly, but otherwise Splunk extracts all the fields you need, based on the sample data below.
{ "log": {\"message\":\"\\n\u003c?xml version=\\\"1.0\\\" encoding=\\\"utf-8\\\"?\u003e\\n\u003c!DOCTYPE\","context":{"status":400,"traceId":"aacb332c-e907-352b-9f8b-a72a55d75cd0","path":"somepath","method":"GET","account_id":1234},"level":200,"level_name":"INFO","channel":"lumen","datetime":{"date":"2018-10-17 20:49:01.839792","timezone_type":3,"timezone":"UTC"},"extra":[]}, "stream": "stdout", "time": "2018-10-17T20:49:01.841051338Z" }
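To illustrate what the removeslash rule accomplishes, here is a rough Python sketch. The sample string and the simplified pattern are mine, not from the original post, and only the \" case is handled; the real SEDCMD also covers \\" sequences:

```python
import json
import re

# Simplified slice of the escaped "log" payload as it appears in _raw
# (constructed for illustration)
raw_log = r'{\"level\":200,\"level_name\":\"INFO\",\"channel\":\"lumen\"}'

# Rough equivalent of SEDCMD-removeslash: turn \" into a plain quote
cleaned = re.sub(r'\\"', '"', raw_log)

parsed = json.loads(cleaned)     # the cleaned text is now valid JSON
print(parsed["channel"])         # lumen
```

Once the escaping is stripped at index time, the nested keys become ordinary JSON that Splunk's automatic extractions can walk.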
Great answer. I won't be able to test this for a while but I am going to reference it in my future configs.
I have converted my comment to an answer; if it works for you, you can accept it as the answer.
We do have that set, AFAIK. However, a closer look at the raw entry:
{"log":"{\"message\":\"\\n\u003c?xml version=\\\"1.0\\\" encoding=\\\"utf-8\\\"?\u003e\\n\u003c!DOCTYPE...
shows "log" is actually a string, not a JSON object. You would have to feed it into a formatter without the surrounding quotes.
That being said, appending "spath input=log" to the query will extract all the fields in the "log" string; it just won't pretty-print the results.
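A minimal version of that search might look like this (the sourcetype name and the chosen fields are placeholders, not from the original post):

```
sourcetype=yourSourcetype
| spath input=log
| table message, context.status, context.path, level_name
```

spath parses the JSON text held in the log field and promotes its keys to ordinary search-time fields, even though the event display stays flat.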
I don't think "INDEXED_EXTRACTIONS = json" can account for this without some customization.
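The double encoding described above can be sketched in Python, using a hypothetical event simplified from the sample data:

```python
import json

# Hypothetical Docker-style raw event: the value of "log" is itself a
# JSON-encoded string, not a JSON object
raw = '{"log":"{\\"message\\":\\"looks like we got no XML document\\",\\"level\\":200}","stream":"stdout"}'

outer = json.loads(raw)
print(type(outer["log"]))         # <class 'str'> -- "log" is a string

inner = json.loads(outer["log"])  # a second decode yields the nested fields
print(inner["message"])           # looks like we got no XML document
```

This is why a single pass of JSON extraction stops at one level: the nested document only appears after a second decode of the log string.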
Hi @wsanderstii,
If you are ingesting this data into Splunk using the Splunk Universal Forwarder, can you please try the configuration below on your Universal Forwarder?
props.conf
[yourSourcetype]
INDEXED_EXTRACTIONS = json
And then restart the Splunk service on the Universal Forwarder.
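For reference, the restart via the Splunk CLI (path shown relative to your install directory):

```
$SPLUNK_HOME/bin/splunk restart
```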