Extraction issue with dynamic field names

D2SI
Communicator

Hello there,

I am stuck with a dynamic field name extraction.

The data is partly JSON and sometimes contains nested JSON in the JSON part:

log-group=abc [2019-05-12 12:23:16,074] - INFO - {"time": "2019-05-12T12:23:16Z", "step": "PRE_REQUEST", "uuid": "abcxyz", "method": "GET", "ip_src": "1.2.3.4", "url": "https://api/abc", "url_params": {"name": "aaa", "reliability": "90", "equipment_name": "bbb", "element_name": "ccc"}, "user": "john"}

I am trying to extract each element of the nested 'url_params'.

To achieve this, I first extract url_params as a JSON string, and then I extract each of its field/value pairs using dynamic field naming.

1st step - extracting url_params:

url_params_extract = {"name": "aaa", "reliability": "90", "equipment_name": "bbb", "element_name": "ccc"}

2nd step - extracting each element:

name = aaa
reliability = 90
equipment_name = bbb
element_name = ccc

The configuration files look like this:

transforms.conf

[url_params]
FORMAT = url_params_extract::$1
REGEX = url_params\"\:\s(\{.*?\})\,

[url_params_extract]
FORMAT = $1::$2
REGEX = \"(.+?)\"\:\s\"(.+?)\"
SOURCE_KEY = url_params_extract

props.conf

[test]
REPORT-url_params = url_params
REPORT-url_params_extract = url_params_extract

EVAL-url_params = null
EVAL-url_params_extract = nullif(url_params_extract, "{}")
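
To make the two-step chain easier to follow, here is a rough sketch that replays both REGEX patterns outside Splunk with Python's `re` module. This is only an approximation: Python's regex engine is not the one Splunk uses, and the variable names (`raw_event`, `fields`) are made up for the illustration. It shows what the two transforms are expected to produce on the sample event.

```python
import re

# Sample event from the post, split across lines for readability.
raw_event = (
    'log-group=abc [2019-05-12 12:23:16,074] - INFO - '
    '{"time": "2019-05-12T12:23:16Z", "step": "PRE_REQUEST", "uuid": "abcxyz", '
    '"method": "GET", "ip_src": "1.2.3.4", "url": "https://api/abc", '
    '"url_params": {"name": "aaa", "reliability": "90", '
    '"equipment_name": "bbb", "element_name": "ccc"}, "user": "john"}'
)

# Step 1: same pattern as the [url_params] stanza, applied to the raw event.
step1 = re.search(r'url_params\"\:\s(\{.*?\})\,', raw_event)
url_params_extract = step1.group(1)

# Step 2: same pattern as the [url_params_extract] stanza, applied to the
# output of step 1 (this stands in for SOURCE_KEY = url_params_extract).
# Each match yields a (field name, value) pair, like FORMAT = $1::$2.
fields = dict(re.findall(r'\"(.+?)\"\:\s\"(.+?)\"', url_params_extract))

print(url_params_extract)
print(fields)
```

On this sample the last value comes out as `ccc`, with no trailing quote or brace.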

The problem is that the last element always comes out with a trailing double quote and closing curly bracket.

For instance:

url_params_extract = {"name": "aaa", "reliability": "90", "equipment_name": "bbb", "element_name": "ccc"}

Result:

element_name = ccc"} 

Instead of desired:

element_name = ccc

This happens even though the regex tests OK on regex101.

Even weirder, if I extract the nested JSON without the curly braces, the issue remains:

url_params_extract = "name": "aaa", "reliability": "90", "equipment_name": "bbb", "element_name": "ccc"

I would still have:

element_name = ccc"}

Unfortunately, I am not able to reproduce the issue with this sample event, and I am still trying to figure out why.

But I am starting to think that I am missing something about how the '$1::$2' format is used.

Any hints?

somesoni2
Revered Legend

Give this a try (update this transforms.conf entry; everything else stays the same):

[url_params_extract]
 FORMAT = $1::$2
 REGEX = \"(.+?)\"\:\s\"([^\"\}]+)\"
 SOURCE_KEY = url_params_extract
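
For comparison, a small sketch that runs the original value pattern `(.+?)` and the suggested `([^\"\}]+)` side by side on the url_params_extract string from the question. It uses Python's `re` module as a stand-in, which is an assumption: Splunk's regex engine may differ in details, and the names `original` and `suggested` are only labels for this illustration.

```python
import re

# The nested JSON string as shown in the question.
url_params_extract = (
    '{"name": "aaa", "reliability": "90", '
    '"equipment_name": "bbb", "element_name": "ccc"}'
)

# Value group is a lazy "anything", as in the question.
original = r'\"(.+?)\"\:\s\"(.+?)\"'
# Value group is a negated class: it can never contain a double quote or '}'.
suggested = r'\"(.+?)\"\:\s\"([^\"\}]+)\"'

original_pairs = re.findall(original, url_params_extract)
suggested_pairs = re.findall(suggested, url_params_extract)

print(original_pairs)
print(suggested_pairs)
print(original_pairs == suggested_pairs)
```

Both patterns return the same four pairs on this string, which fits the observation that the sample does not reproduce the problem. The difference is that the negated character class rules out a value such as `ccc"}` by construction, whatever text the pattern is applied to.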

D2SI
Communicator

Thanks a lot!

I had already tried something similar, so it did not resolve the issue directly, but it helped me put my finger on what was wrong!

The culprit was the extraction already in place for the whole JSON part:

{"time": "2019-05-12T12:23:16Z", "step": "PRE_REQUEST", "uuid": "abcxyz", "method": "GET", "ip_src": "1.2.3.4", "url": "https://api/abc", "url_params": {"name": "aaa", "reliability": "90", "equipment_name": "bbb", "element_name": "ccc"}, "user": "john"}

That extraction produced the element in question (element_name = ccc"}), and the value was not overridden by the extractions that ran afterwards, as I had assumed it would be.

It even turned out that fixing that extraction along the lines of your suggestion removed the need to extract 'url_params' separately!
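
As a cross-check outside Splunk, the expected field/value pairs for the sample event can be listed by parsing the JSON part with Python's `json` module. This is a different technique from the regex-based transforms discussed above, not what the props/transforms do, and the names `json_part` and `flat` are hypothetical. It only gives a reference list to compare the Splunk extractions against.

```python
import json

# Sample event from the question, split across lines for readability.
raw_event = (
    'log-group=abc [2019-05-12 12:23:16,074] - INFO - '
    '{"time": "2019-05-12T12:23:16Z", "step": "PRE_REQUEST", "uuid": "abcxyz", '
    '"method": "GET", "ip_src": "1.2.3.4", "url": "https://api/abc", '
    '"url_params": {"name": "aaa", "reliability": "90", '
    '"equipment_name": "bbb", "element_name": "ccc"}, "user": "john"}'
)

# The JSON part starts at the first '{' (the log prefix contains none).
json_part = raw_event[raw_event.index("{"):]
event = json.loads(json_part)

# Keep the top-level scalar fields, then merge in the nested url_params.
flat = {k: v for k, v in event.items() if not isinstance(v, dict)}
flat.update(event.get("url_params", {}))

for name, value in flat.items():
    print(f"{name} = {value}")
```

Any field that shows up in Splunk with extra characters, such as a trailing `"}`, can then be traced back to the extraction that produced it.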

Thanks again,
