Getting Data In

Run the equivalent of an `extract` command on a structured JSON event's subfield

ckarcher
New Member

We're ingesting structured JSON logs from a source and would like to run the equivalent of the extract command on one of the event's subfields. The events look something like this:

{
    "field1":"value1",
    "field2":"value2",
    "field3":"value3",
    "msg":"field4=value4 field5=value5 field6=value6"
}

The top level field1/field2/field3/msg fields are all being extracted as expected. However, we'd also like to extract arbitrary key/value pairs defined in the msg field, ideally at index time so that they're available to all searches. The key/value pairs that exist in the msg field are not known beforehand. Is it possible to still extract them at index time and make them available to searches?

We've been able to achieve the desired result with a search command chain like the following:

...base search...
| rename _raw AS _temp 
| rename msg AS _raw 
| extract pairdelim="?&" kvdelim="=" 
| rename _raw AS msg 
| rename _temp AS _raw

However, we have some dashboards that run lots of searches, and we don't want to hack the above command chain into every individual search query.

0 Karma

ckarcher
New Member

I was able to solve this by creating two field transforms like the following that handle the case where the values are in quotes (e.g., key1="value1 with spaces") as well as the case where they aren't (e.g., key1=value1withoutspaces).

json_msg_transform_with_quotes
(?P<_KEY_1>\w+)="(?P<_VAL_1>[^"]*)"

json_msg_transform_without_quotes
(?P<_KEY_1>\w+)=(?P<_VAL_1>[^"\s]+)

I then wired up two new field extractions that use those transforms on the desired source type, and I'm now seeing all the fields (both those from the raw JSON event as well as those embedded in the msg field) available at query time.
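
For anyone who wants to check what the two regexes capture before wiring them into a source type, here is a minimal Python sketch. It is only an illustration outside Splunk: the function name extract_msg_pairs and the sample event are made up for this example, and it assumes the regexes are applied to the decoded value of msg rather than to the raw JSON text (where inner quotes would appear escaped as \"). Splunk's own regex engine and extraction pipeline may behave differently in edge cases.

```python
import json
import re

# The two transform regexes from the post, unchanged.
WITH_QUOTES = re.compile(r'(?P<_KEY_1>\w+)="(?P<_VAL_1>[^"]*)"')
WITHOUT_QUOTES = re.compile(r'(?P<_KEY_1>\w+)=(?P<_VAL_1>[^"\s]+)')


def extract_msg_pairs(raw_event):
    """Return the key/value pairs found in the event's msg field.

    Hypothetical helper: it mimics the effect of the two transforms,
    not Splunk's actual implementation.
    """
    msg = json.loads(raw_event).get("msg", "")
    fields = {}
    for pattern in (WITH_QUOTES, WITHOUT_QUOTES):
        for match in pattern.finditer(msg):
            # Keep the first value seen for a key.
            fields.setdefault(match.group("_KEY_1"), match.group("_VAL_1"))
    return fields


# Sample event (an assumption): one quoted value, two unquoted values.
event = '{"field1":"value1","msg":"field4=value4 field5=\\"value 5\\" field6=value6"}'
print(extract_msg_pairs(event))
```

The quoted pattern picks up field5 with its embedded space, and the unquoted pattern picks up field4 and field6; the unquoted pattern skips field5 because its value starts with a double quote.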

0 Karma

kamlesh_vaghela
SplunkTrust

@ckarcher,

Can you please try adding the configuration below to props.conf?

File path: SPLUNK_HOME/etc/apps/YOUR_APP/local/props.conf

[YOUR_SOURCETYPE]
EXTRACT-field4,field5,field6 = ^[^=\n]*=(?P<field4>\w+)[^=\n]*=(?P<field5>\w+)[^=\n]*=(?P<field6>\w+)

Note: You may need to adjust the regular expression to match your events and requirements.

Thanks

0 Karma

ckarcher
New Member

Per the original post, the names of the key/value pairs in the msg field are arbitrary and unknown beforehand.

0 Karma

kamlesh_vaghela
SplunkTrust

@ckarcher,

You can also try this:

| makeresults 
| eval _raw="{\"field1\":\"value1\",\"field2\":\"value2\",\"field3\":\"value3\",\"msg\":\"field4=value4 field5=value5 field6=value6\"}" 
| extract 
| eval _raw=msg 
| extract

0 Karma

ckarcher
New Member

Hi @kamlesh_vaghela - we've already proven that it's possible to extract the K/V pairs from msg at search time with an extract command like you've provided. However, we have dashboards with lots of searches in them, and we want to avoid hacking the rename + extract command into each of them. Do you know if it's possible to do this in a way that works for all searches against a given source type?

0 Karma

kamlesh_vaghela
SplunkTrust

@ckarcher,

Please check my answer below.

0 Karma

poete
Builder

Hello @ckarcher,
If the format of msg does not change, you can use rex, as below:

| makeresults 
| eval _raw="{\"field1\":\"value1\",\"field2\":\"value2\",\"field3\":\"value3\",\"msg\":\"field4=value4 field5=value5 field6=value6\"}"
| spath
| rex field=msg "field4=(?<field4>.*) field5=(?<field5>.*) field6=(?<field6>.*)"

0 Karma

ckarcher
New Member

Hi @poete - the format of the msg field is unknown beforehand. It may contain any number of arbitrary key/value pairs, and we want to extract them all. I've updated the question to reflect this.

0 Karma