- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
How do I extract key values from a json object?
Consider I have multiple such JSON events pushed to splunk.
{
"orderNum" : "1234",
"orderLocation" : "demoLoc",
"details":{
"key1" : "value1",
"key2" : "value2"
}
}
I am trying to figure out a spunk query that would give me the following output in a table
orderNum | key | value | orderLocation |
1234 | key1 | value1 | demoLoc |
1234 | key2 | value2 | demoLoc |
the value from the key-value pair can be an escaped JSON string. we also need to consider this while writing regex.
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
This is kind of sed-y but it should work: (assuming your automatic kv field extraction is working on your json event)
| spath input=_raw path=details output=hold
| rex field=hold mode=sed "s/({\s*|\s*}|,\s*)//g"
| makemv hold delim="\"\""
| mvexpand hold
| rex field=hold "(?<key>[^,\s\"]*)\"\s:\s\"(?<value>[^,\s\"]*)" max_match=0
| table orderNum key value orderLocation
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hello @marnall,
I appreciate your response, it did work for me after tuning it a bit
| spath input=_raw path=details output=hold
| rex field=hold mode=sed "s/({\s*|\s*}|\s*)//g"
| makemv hold delim=","
| mvexpand hold
| rex field=hold "\"(?<key>[^\"]+)\":\"(?<value>[^\"]+)\""
| table orderNum key value orderLocation
However, I have a follow-up on this.
our admin has set some limitations for mvexpand command when I try to increase the search period for the last 3 days,
I get this warning:
output will be truncated at 6300 results due to excessive memory usage. Memory threshold of 500MB as configured in limits.conf / [mvexpand] / max_mem_usage_mb has been reached.
Is there any alternate way to achieve the same results without using mvexpand, considering that on average there can be more than 50-60 key-value present under the "details" of a single event, and there can be 40K events per 30 days (assuming the retention period of the events is 30 days)?
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Does this work better?
| spath input=_raw path=details output=hold
| rex field=hold "\"(?<kvs>[^\"]*\"*[^\"]*\"*[^\"]*\"*)\"" max_match=0
| stats values(*) as * by kvs
| rex field=kvs "(?<key>[^\"]*)\" : \"(?<value>[^\"]*)" max_match=0
| table orderNum key value orderLocation
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
did not work,
| makeresults
| eval _raw = "{\"orderNum\":\"1234\",\"orderLocation\":\"demoLoc\",\"details\":{\"key1\":\"value1\",\"key2\":\"value2\"}}"
| spath
| spath input=_raw path=details output=hold
| rex field=hold "\"(?<kvs>[^\"]*\"*[^\"]*\"*[^\"]*\"*)\"" max_match=0
| stats values(*) as * by kvs
| rex field=kvs "(?<key>[^\"]*)\" : \"(?<value>[^\"]*)" max_match=0
| table orderNum key value orderLocation
the value from the key-value pair can be an escaped JSON string. we also need to consider this while writing regex.
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Small tweak to the regex: (removing two space characters from the second-to-last line)
| makeresults
| eval _raw = "{\"orderNum\":\"1234\",\"orderLocation\":\"demoLoc\",\"details\":{\"key1\":\"value1\",\"key2\":\"value2\"}}"
| spath
| spath input=_raw path=details output=hold
| rex field=hold "\"(?<kvs>[^\"]*\"*[^\"]*\"*[^\"]*\"*)\"" max_match=0
| stats values(*) as * by kvs
| rex field=kvs "(?<key>[^\"]*)\":\"(?<value>[^\"]*)" max_match=0
| table orderNum key value orderLocation
If the value can be an escaped JSON string, then indeed you need to be more crafty with the regex. E.g.:
| makeresults
| eval _raw = "{\"orderNum\":\"1234\",\"orderLocation\":\"demoLoc\",\"details\":{\"key1\":\"{\\\"jsonvalue\\\":\\\"jsonvaluevalue\\\",\\\"jsonvalue2\\\":\\\"jsonvaluevalue2\\\"}\",\"key2\":\"value2\"}}"
| spath
| spath input=_raw path=details output=hold
| rex field=hold "(?<kvs>\"[^\"]*\":\"{?[^}]*}?\")" max_match=0
| stats values(*) as * by kvs
| rex field=kvs "(?<key>[^\"]*)\":\"(?<value>{?[^{}]*}?)\"" max_match=0
| table orderNum key value orderLocation
