Getting Data In

How can we extract a JSON document embedded in an event?

ddrillic
Ultra Champion

We have events such as:

10.10.2017 09:40:39.651 *INFO* [10.86.208.119 [1507646439651] POST /apps/xxxx/yyyy HTTP/1.1] com.xxxx.yyyy.api.impl.logging.info.InfoLoggerServiceImpl {"id":{"access_token":"7ee2ea18-e72c-449d-9dec-28d02b116c92","uid":"zzzzz","jsessionID":"aaaaaaa","uuid":"12e255ac-35e9-4630-a36b-89aa27e9566e"},"request":{"url":"https://bbb.cccc.com/content/uuuuuu"..... }]}}

The JSON document is part of the event. Can we extract it?

sshelly_splunk
Splunk Employee

I took a quick look at this, and I think this transform might work for you. It will not extract the "id" or "request" fields, as I am not sure what they are. It did extract the following: access_token, uid, jsessionID, uuid and url.
In props.conf, I added: REPORT-extract = json_embedded
The transforms.conf stanza is:
[json_embedded]
REGEX = "(\w+)"."(\S+?)"
FORMAT = $1::$2

Hope this helps. Reply if it does not.
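
To see which key/value pairs this REGEX picks up, the pattern can be tried outside Splunk. The following is a minimal Python sketch, assuming Python's re module handles this particular pattern the same way Splunk's regex engine does. The variable names are illustrative and are not part of any Splunk configuration.

```python
import re

# Sample event from the question; the elision ("..... }]}}") is kept as posted.
event = (
    '10.10.2017 09:40:39.651 *INFO* [10.86.208.119 [1507646439651] '
    'POST /apps/xxxx/yyyy HTTP/1.1] '
    'com.xxxx.yyyy.api.impl.logging.info.InfoLoggerServiceImpl '
    '{"id":{"access_token":"7ee2ea18-e72c-449d-9dec-28d02b116c92",'
    '"uid":"zzzzz","jsessionID":"aaaaaaa",'
    '"uuid":"12e255ac-35e9-4630-a36b-89aa27e9566e"},'
    '"request":{"url":"https://bbb.cccc.com/content/uuuuuu"..... }]}}'
)

# Same pattern as the [json_embedded] stanza: a quoted word, any one
# character (here the colon), then a quoted run of non-whitespace.
pattern = re.compile(r'"(\w+)"."(\S+?)"')

pairs = pattern.findall(event)          # list of (name, value) tuples
names = [name for name, _ in pairs]
print(names)
```

The keys "id" and "request" do not appear in the result because they are followed by an opening brace rather than a quoted string, so the pattern only matches the leaf values.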

ddrillic
Ultra Champion

Just applied it and it works perfectly, much appreciated. Is there anything like the spath command that we use for XML documents that works for JSON documents, so we can reach nested elements?

sshelly_splunk
Splunk Employee

ddrillic - You can index just the JSON portion of the event, but the text before the JSON portion includes the timestamp and other details. Since this log is not proper JSON, I think you're going to need to use regex on it for display purposes.

When looking at XML or JSON data (assuming it conforms to the standard; sorry, I'm not exactly sure what that all entails), you can set KV_MODE = xml or KV_MODE = json in props.conf, or use something like the above. My skills are really around getting data in rather than SPL proper, so I will defer to the SPL experts on SPL-specific questions. My focus is on making sure data comes in correctly, so the SPL doesn't need to be complex to get value out of the data. Sorry if that doesn't help.
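
The idea of isolating the JSON portion and then treating it as a real document can be sketched outside Splunk. This Python example is an illustration under assumptions: the event is a shortened, hypothetical version of the one in the question, with the JSON closed right after "url" so that it parses, and the flatten helper is made up for this sketch. It produces dotted paths such as id.uid, which is similar in spirit to what spath returns once the JSON text is available in a field (spath accepts JSON as well as XML and can read from a named input field).

```python
import json

# Shortened, hypothetical event: the JSON is closed after "url" so it is valid.
event = (
    '10.10.2017 09:40:39.651 *INFO* [10.86.208.119 [1507646439651] '
    'POST /apps/xxxx/yyyy HTTP/1.1] InfoLoggerServiceImpl '
    '{"id":{"uid":"zzzzz","jsessionID":"aaaaaaa"},'
    '"request":{"url":"https://bbb.cccc.com/content/uuuuuu"}}'
)

def flatten(node, prefix=""):
    """Flatten nested dicts into dotted paths, e.g. {'id': {'uid': 1}} -> {'id.uid': 1}."""
    flat = {}
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat

# The log prefix contains no "{", so the first brace starts the JSON document.
start = event.index("{")
doc, _ = json.JSONDecoder().raw_decode(event, start)

fields = flatten(doc)
print(fields["id.uid"], fields["request.url"])
```

Unlike the regex approach, this keeps the nesting, so "id" and "request" remain addressable as parents of their child fields.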

ddrillic
Ultra Champion

Very interesting. So you are saying that if it's a "real" JSON document, we can parse it as such.

ddrillic
Ultra Champion

Beautiful thing!!! I've wanted to ask this for a while: is there a way to test these configurations from the search interface before adding them to the config files?

blacknight659
Explorer

You could take a raw copy of the logs and use the UI to upload it and test the event breaking and extraction. I think Splunk really likes JSON, since it auto-extracts the fields and values.

sshelly_splunk
Splunk Employee

I use regex101 to test all of my transforms (unless they are extremely simple:)).
Copy 2 events if available into the "Test String" window, and go to town.

sshelly_splunk
Splunk Employee

Sorry, I just re-read your question. I sometimes test regex in the search bar, but not usually. The format there is slightly different, so I use regex101, but that might just be a preference.

ddrillic
Ultra Champion

Perfect, so I got the REGEX part. What does FORMAT = $1::$2 mean?

ddrillic
Ultra Champion

For future reference:

FORMAT = $1::$2 (where the REGEX extracts both the field name and the field value)

from "Create custom fields at index time" in the Splunk documentation.
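
The way $1 and $2 map onto the REGEX capture groups can be mimicked in a few lines of Python. This is only a sketch of the substitution idea, not Splunk's implementation: apply_format is a hypothetical helper, and the input text is a short fragment of the JSON from the original event.

```python
import re

def apply_format(fmt, match):
    """Replace each $N in the template with capture group N of the match."""
    return re.sub(r"\$(\d+)", lambda m: match.group(int(m.group(1))), fmt)

# Same pattern as the [json_embedded] stanza.
pattern = re.compile(r'"(\w+)"."(\S+?)"')
text = '"uid":"zzzzz","jsessionID":"aaaaaaa"'

fields = {}
for match in pattern.finditer(text):
    # "$1::$2" becomes "<name>::<value>": group 1 names the field,
    # group 2 supplies its value.
    name, _, value = apply_format("$1::$2", match).partition("::")
    fields[name] = value

print(fields)
```

With a fixed name on the left of the "::" instead of $1, every match would populate the same field, which is the other common shape of a FORMAT line.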

sbbadri
Motivator

https://regex101.com/r/FPxKuU/1

or

| makeresults | eval test="10.10.2017 09:40:39.651 INFO [10.86.208.119 [1507646439651] POST /apps/xxxx/yyyy HTTP/1.1] com.xxxx.yyyy.api.impl.logging.info.InfoLoggerServiceImpl {\"id\":{\"access_token\":\"7ee2ea18-e72c-449d-9dec-28d02b116c92\",\"uid\":\"zzzzz\",\"jsessionID\":\"aaaaaaa\",\"uuid\":\"12e255ac-35e9-4630-a36b-89aa27e9566e\"},\"request\":{\"url\":\"https://bbb.cccc.com/content/uuuuuu\"..... }]}}" | rex field=test max_match=0 "\"(?<key>\w+)\":\"(?<value>[^\"]+)\""

ddrillic
Ultra Champion

Wow - man. very pretty!!!
