I have an event which looks like this:
<134>2019-12-05T16:25:59.731796+11:00 HOSTNAME consolidated_audit: {"affectedEntityList":[{"entityType":"vm","name":"TARGET","uuid":"62b439a7-6c7d-4274-ae35-db06435cec44"}],"alertUid":"VmUpdateAudit","classificationList":["UserAction"],"clientIp":"10.10.0.1","creationTimestampUsecs":"1575523555797505","defaultMsg":"Updated VM TARGET","opEndTimestampUsecs":"1575523555794928","opStartTimestampUsecs":"1575523555698501","operationType":"Update","originatingClusterUuid":"0005407a-59fe-d90d-7ac4-246e9610e720","params":{"annotation":"annotation","hardware_clock_timezone":"timezone","is_agent_vm":"false","memory_mb":"32768","num_cores_per_vcpu":"1","num_vcpus":"8","old_name":"TARGET","vm_name":"TARGET"},"recordType":"Audit","sessionId":"c2ba8526-84f2-4cd0-b1a4-7df762ffa353","severity":"Audit","userName":"admindigital61.jxh01","uuid":"193fd00b-513a-4c80-b40a-a73c6f69191e"}
I'd like to configure auto-extraction of the embedded JSON. I've tried putting KV_MODE = json in props.conf, but Splunk doesn't work it out for itself. I can do it with a combination of rex piped to spath at search time, but I'd like the fields to be auto-extracted.
Can anyone help?
Hey Jeremy,
To parse the JSON automatically, you need to strip the syslog header so only the payload remains, and then let Splunk do the parsing.
The payload is the actual data within the braces.
The configuration below rewrites the event at index time and then performs an indexed extraction.
After adding the configuration, restart Splunk and check.
Please try it and let us know.
Add the following lines to your sourcetype stanza in props.conf.
[your_source_type]
INDEXED_EXTRACTIONS = json
TRANSFORMS-transform_json_01 = transform_json_01
Modify transforms.conf as below.
[transform_json_01]
REGEX = ^(?:[^\{]+)(.+$)
FORMAT = $1
DEST_KEY = _raw
LOOKAHEAD = 50000
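For reference, this transform captures everything from the first `{` onward and writes it back to _raw, so a raw event like the sample above would be rewritten to pure JSON (payload abbreviated here):

```
Before: <134>2019-12-05T16:25:59.731796+11:00 HOSTNAME consolidated_audit: {"affectedEntityList":[...],"recordType":"Audit",...}
After:  {"affectedEntityList":[...],"recordType":"Audit",...}
```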
Hi @jeremyhagand6,
Your data doesn't seem like JSON since it starts with this header: <134>2019-12-05T16:25:59.731796+11:00 HOSTNAME consolidated_audit:
In order to auto-extract, you have two choices:
1- Extract the JSON part of your data into a field and use the spath command to auto extract all the fields. Details about the spath command here : https://docs.splunk.com/Documentation/SplunkCloud/latest/SearchReference/Spath
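For example, option 1 could look like this at search time (the field name json_payload is just an illustration; adjust the index and sourcetype to yours):

```
index=your_index sourcetype=your_sourcetype
| rex field=_raw "^[^\{]*(?<json_payload>\{.+\})$"
| spath input=json_payload
```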
2- At index time, filter out this part of your data: <134>2019-12-05T16:25:59.731796+11:00 HOSTNAME consolidated_audit:
Your events will then be pure JSON and the KV extraction will work like a charm. This is only recommended if you're not using that part of your data at all, and if you can find your hostname and timestamp in the data.
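A minimal props.conf sketch of option 2 using SEDCMD (the stanza name and settings are assumptions; verify against your data before deploying):

```
[your_sourcetype]
SEDCMD-strip_syslog_header = s/^[^{]+//
KV_MODE = json
TIME_PREFIX = "creationTimestampUsecs":"
TIME_FORMAT = %s%6N
```

Since SEDCMD rewrites _raw, the syslog timestamp and hostname are discarded; TIME_PREFIX/TIME_FORMAT here pull the epoch-microseconds timestamp from inside the JSON instead.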
Let me know if that helps.
Cheers,
David
Hi @DavidHourani, with #1, is there an ability to automate this, or does it have to be included in each search?
Hi @johnansett ,
The automatic KV extraction for JSON won't kick in unless the event is a pure JSON event (not a syslog event).
The simplest way to automate this is option 2, especially if the timestamp you care about is inside the JSON event rather than in the syslog header.
Another way to do this at search time is to use a macro or an eventtype instead of classic index=abc searches.
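As an illustration, a macros.conf entry (the macro name parse_audit_json and the field name json_payload are hypothetical) could bundle the search-time extraction:

```
[parse_audit_json]
definition = rex field=_raw "^[^\{]*(?<json_payload>\{.+\})$" | spath input=json_payload
```

A search can then call index=abc `parse_audit_json` instead of repeating the rex/spath pipeline everywhere.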
As a last resort (if you don't want to change the event from syslog to JSON), you can manually extract the fields you use most often, and fall back to parsing the JSON at search time when you need something less common.
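A sketch of that manual approach in props.conf, pulling two commonly used fields (the stanza name and regexes are assumptions based on the sample event):

```
[your_sourcetype]
EXTRACT-audit_user = "userName":"(?<userName>[^"]+)"
EXTRACT-audit_msg = "defaultMsg":"(?<defaultMsg>[^"]+)"
```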
Let me know if that helps.
Cheers,
David