Splunk Gurus -
I've not yet absorbed JSON data into my setup, but I'm anticipating that many sources will be generating a lot of JSON data in the near future. I wanted to gather some input from this group -
I've read that using spath is the way to work with JSON objects. With that in mind,
thanks for your inputs
Is there any way to make a single extraction run before KV_MODE = json kicks in? I keep seeing cases where only the body of the message is JSON, appearing after the date, level, and logger. In that case, you have to run rex first, followed by spath.
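One workaround, if you can touch props.conf for that sourcetype and the JSON body always starts at the first {, is to strip the prefix at index time with SEDCMD so that KV_MODE = json sees pure JSON. A sketch (the stanza name here is hypothetical):

[my_json_sourcetype]
SEDCMD-strip_prefix = s/^[^{]+//
KV_MODE = json

Be aware that SEDCMD rewrites _raw at index time, so the date/level/logger prefix is permanently gone from the stored event. Only do this if you don't need that prefix, and test timestamp extraction on a sample of data first.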
To the original question...
If the whole body of the event is JSON, then the fields will automatically be extracted. It "just works," and you can summarize by them like any other extracted fields. I've never measured the performance, but it seems pretty good. Still, think about what search-time extraction is actually doing; you probably don't want to convert all of your logs to JSON.
The only caveat I can give is to avoid complicated JSON documents if you can control the documents. For example, if you have arrays of objects where each element is really its own event, you'll have to do some gymnastics with mvexpand to keep the fields within each nested event correlated.
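To make that concrete, here's the usual pattern (a sketch; the array path items{} is made up, so substitute your own):

| spath path=items{} output=item
| mvexpand item
| spath input=item

The first spath copies each array element into a multivalue field, mvexpand splits that into one event per element, and the second spath extracts each element's fields so they stay correlated with one another.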
If your data is all json then you want this in your props:
KV_MODE = JSON
For my JSON data I do little else. For my large data sets I use tsidx namespaces to store what I need for speed.
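For reference, the tsidx pattern looks like this (index, namespace, and field names here are invented): a scheduled search writes the fields you care about into a namespace with tscollect, and reports read it back with tstats, which is much faster than searching the raw events:

index=myjson sourcetype=my_json_sourcetype | fields status, response_ms | tscollect namespace=web_summary

| tstats count avg(response_ms) AS avg_ms from web_summary by status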
If it's not pure JSON, you need to get the JSON into a field, then extract:
| rex field=_raw "(?s)(?<xxx>match_json_here)" | spath input=xxx
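For example, given an event shaped like 2024-01-15 12:00:01 INFO com.example.App {"user":"bob","action":"login"} (an invented sample), you could capture everything from the first { to the end of the event:

| rex field=_raw "(?s)(?<json_body>\{.+\})\s*$" | spath input=json_body

After the spath, user and action are available like any other extracted fields.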