Extraction issue with dynamic field names

D2SI
Communicator

Hello there,

I am stuck with a dynamic field name extraction.

Each event is partly JSON, and the JSON part sometimes contains a nested JSON object:

log-group=abc [2019-05-12 12:23:16,074] - INFO - {"time": "2019-05-12T12:23:16Z", "step": "PRE_REQUEST", "uuid": "abcxyz", "method": "GET", "ip_src": "1.2.3.4", "url": "https://api/abc", "url_params": {"name": "aaa", "reliability": "90", "equipment_name": "bbb", "element_name": "ccc"}, "user": "john"}

I am trying to extract each element of the nested 'url_params'.

To achieve this, I first extract url_params as a JSON string, and then I extract each of its field/value pairs using dynamic field naming.

1st step - extracting url_params:

url_params_extract = {"name": "aaa", "reliability": "90", "equipment_name": "bbb", "element_name": "ccc"}

2nd step - extracting each element:

name = aaa
reliability = 90
equipment_name = bbb
element_name = ccc

The configuration files look like this:

transforms.conf

[url_params]
FORMAT = url_params_extract::$1
REGEX = url_params\"\:\s(\{.*?\})\,

[url_params_extract]
FORMAT = $1::$2
REGEX = \"(.+?)\"\:\s\"(.+?)\"
SOURCE_KEY = url_params_extract

props.conf

[test]
REPORT-url_params = url_params
REPORT-url_params_extract = url_params_extract

EVAL-url_params = null()
EVAL-url_params_extract = nullif(url_params_extract, "{}")
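
For anyone who wants to follow the two stages outside Splunk, here is a minimal Python sketch of what the two transforms are meant to do. It is only an approximation: Python's `re` module stands in for Splunk's PCRE engine, the function name `extract_url_params` is made up for illustration, and `dict(findall(...))` is an assumed stand-in for the `$1::$2` dynamic naming (first capture group as field name, second as value).

```python
import re

# Sample event copied from the post above.
EVENT = (
    'log-group=abc [2019-05-12 12:23:16,074] - INFO - '
    '{"time": "2019-05-12T12:23:16Z", "step": "PRE_REQUEST", "uuid": "abcxyz", '
    '"method": "GET", "ip_src": "1.2.3.4", "url": "https://api/abc", '
    '"url_params": {"name": "aaa", "reliability": "90", '
    '"equipment_name": "bbb", "element_name": "ccc"}, "user": "john"}'
)

# Stage 1: same pattern as the [url_params] stanza; captures the nested object.
STAGE1 = re.compile(r'url_params\"\:\s(\{.*?\})\,')

# Stage 2: same pattern as the [url_params_extract] stanza.
# Group 1 plays the role of $1 (field name), group 2 the role of $2 (value).
STAGE2 = re.compile(r'\"(.+?)\"\:\s\"(.+?)\"')


def extract_url_params(event):
    """Hypothetical helper: run stage 1, then stage 2 on its capture."""
    match = STAGE1.search(event)
    if match is None:
        return {}
    nested = match.group(1)  # corresponds to url_params_extract
    return dict(STAGE2.findall(nested))


fields = extract_url_params(EVENT)
print(fields)
```

On this sample event the sketch yields clean values for all four nested keys, which is consistent with the sample not reproducing the issue.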

The problem is that the last element always comes out with a trailing double quote and closing curly bracket.

For instance

url_params_extract = {"name": "aaa", "reliability": "90", "equipment_name": "bbb", "element_name": "ccc"}

Result:

element_name = ccc"}

Instead of desired:

element_name = ccc

This happens even though the regex tests fine on regex101.

Even stranger, if I extract the nested JSON without the curly braces, the issue remains:

url_params_extract = "name": "aaa", "reliability": "90", "equipment_name": "bbb", "element_name": "ccc"

I would still have:

element_name = ccc"}

Unfortunately, I am not able to reproduce the issue with this sample event, and I am still trying to figure out why.

But I am starting to think that I am missing something about how the '$1::$2' format is used.

Any hints?


somesoni2
SplunkTrust

Give this a try (update only this transforms.conf entry; everything else stays the same):

[url_params_extract]
 FORMAT = $1::$2
 REGEX = \"(.+?)\"\:\s\"([^\"\}]+)\"
 SOURCE_KEY = url_params_extract
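
To see what the changed value pattern does, here is a small Python sketch comparing the original and the suggested regex on the nested string from the question. It assumes Python's `re` behaves like Splunk's PCRE for these simple patterns; the variable names are only for illustration. The difference is that `[^\"\}]+` cannot run past a double quote or a closing brace, whereas `.+?` can if the engine needs it to in order to find a match.

```python
import re

# Nested object as shown in the question (url_params_extract).
NESTED = (
    '{"name": "aaa", "reliability": "90", '
    '"equipment_name": "bbb", "element_name": "ccc"}'
)

# Original value group: lazy "any character".
ORIGINAL = re.compile(r'\"(.+?)\"\:\s\"(.+?)\"')

# Suggested value group: one or more characters that are neither " nor }.
SUGGESTED = re.compile(r'\"(.+?)\"\:\s\"([^\"\}]+)\"')

original_pairs = ORIGINAL.findall(NESTED)
suggested_pairs = SUGGESTED.findall(NESTED)

print(suggested_pairs)
print(original_pairs == suggested_pairs)
```

Both patterns give the same four pairs on this sample, so a stray `"}` in Splunk probably comes from somewhere other than this stanza on this input.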

D2SI
Communicator

Thanks a lot!

I had already tried something similar, so it did not resolve the issue directly, but it helped me put my finger on what was wrong!

The culprit was the extraction already in place for the whole JSON part:

{"time": "2019-05-12T12:23:16Z", "step": "PRE_REQUEST", "uuid": "abcxyz", "method": "GET", "ip_src": "1.2.3.4", "url": "https://api/abc", "url_params": {"name": "aaa", "reliability": "90", "equipment_name": "bbb", "element_name": "ccc"}, "user": "john"}

That extraction produced the value in question (element_name = ccc"}), and it was not overridden by the extractions that ran afterwards, as I had assumed it would be.

It even turned out that fixing that extraction along the lines of your suggestion removed the need to extract 'url_params' separately!
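
The final configuration was not posted, so the following Python sketch only illustrates the idea of a single pass over the whole JSON part. It assumes that both the key and the value use negated character classes (the key class `[^\"]+` is an assumption, not taken from the thread) and that Python's `re` matches the way Splunk's PCRE does here. `WHOLE_JSON_KV` is a hypothetical name.

```python
import re

# Whole JSON part of the sample event, as quoted above.
JSON_PART = (
    '{"time": "2019-05-12T12:23:16Z", "step": "PRE_REQUEST", "uuid": "abcxyz", '
    '"method": "GET", "ip_src": "1.2.3.4", "url": "https://api/abc", '
    '"url_params": {"name": "aaa", "reliability": "90", '
    '"equipment_name": "bbb", "element_name": "ccc"}, "user": "john"}'
)

# Assumed pattern: the key may not contain a quote, and the value may
# contain neither a quote nor a closing brace.
WHOLE_JSON_KV = re.compile(r'\"([^\"]+)\"\:\s\"([^\"\}]+)\"')

fields = dict(WHOLE_JSON_KV.findall(JSON_PART))

for name, value in fields.items():
    print(f"{name} = {value}")
```

With this pattern the nested keys are picked up in the same pass as the top-level ones. "url_params" is skipped because its value starts with `{` rather than a quote, so it does not become a field of its own.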

Thanks again,
