Knowledge Management

Spath calculated field limitations

Aatom
Explorer

Hey Splunkers!

We are running into an issue with an on-prem distributed deployment where the AWS feed is not extracting nested JSON fields at search time without the use of spath. We get first level and partial second level auto extraction, but it stops there. We need to normalize this data for functionality with friendly name alias's, and would like to avoid end users having to use spath with a long rename macro. yes, KV_MODE is set to JSON on the SH, IDX, and HF. no, we'd rather not perform indexed extractions. We've upped several limits and are unsure why it wouldn't just auto extract at searchtime. Please halp!

Here is the issue with using spath in calculated fields as a work around. I can calculate version and id consistently, but next level nested values with lists do not calculate to return fields at search-time.

 

works -> aws : EVAL-version version spath('BodyJson.Message', "version")

works -> aws : EVAL-id id spath('BodyJson.Message', "id")

doesn't work -> aws : EVAL-resources resources spath('BodyJson.Message', 'resources{}')

 

BodyJson: {
Message: {"version":"0","id":"-e154-88b-c","detail-type":"Findings - Imported","source":"aws.","account":"4724","time":"2021-01-13T20:09:26Z","region":"ca-central-1","resources":["arn:aws:ca"],"detail":{"findings":[{"ProductArn":"arn:aws:"...

 

What am I doing wrong here? Also, is there a known limitation on how many cycles of spath calculations the system will run on a specific field? Thanks in advance!

Labels (1)
0 Karma
1 Solution

to4kawa
Ultra Champion

try SEDCMD in props.conf ,also

SEDCMD-trim = s/\\\//g

View solution in original post

to4kawa
Ultra Champion
index=_internal | head 1 | fields _raw
| eval _raw="{\"BodyJson\":{\"Message\":{\"version\":\"0\",\"id\":\"-e154-88b-c\",\"detail-type\":\"Findings - \\\"Imported\\\"\",\"source\":\"aws.\",\"account\":\"4724\",\"time\":\"2021-01-13T20:09:26Z\",\"region\":\"ca-central-1\",\"resources\":[\"arn:aws:ca\"],\"detail\":{\"findings\":[{\"ProductArn\":\"arn:aws:\"}]}}}}"
| eval resources=json_extract(_raw,'BodyJson.Message','resources{}')

["arn:aws:ca"]
It seems to be extracted incorrectly.It's a bug.

 

index=_internal | head 1 | fields _raw
| eval _raw="{\"BodyJson\":{\"Message\":{\"version\":\"0\",\"id\":\"-e154-88b-c\",\"detail-type\":\"Findings - \\\"Imported\\\"\",\"source\":\"aws.\",\"account\":\"4724\",\"time\":\"2021-01-13T20:09:26Z\",\"region\":\"ca-central-1\",\"resources\":[\"arn:aws:ca\"],\"detail\":{\"findings\":[{\"ProductArn\":\"arn:aws:\"}]}}}}"
| rex mode=sed "s/resources\":\[(.*?)\]/resources\": \1/"
| spath

How about SEDCMD?

0 Karma

Aatom
Explorer

Thanks for the quick reply @to4kawa . I think this is part of the issue... The _raw output has the nested JSON objects escaping quotes with a backslash under Message. Is my best bet to setup a props/transforms on the SH to replace \" with " ? Are there any working examples you could point me towards? Thanks!

"BodyJson": {"Type": "Notification", "MessageId": "4f8b9202e", "TopicArn": "arn:aws:sns:ap-south-1:6679786758:events-ap-south-1", "Message": "{\"version\":\"0\",\"id\":\"0a880\",\"detail-type\":\"Findings - Imported\",\"source\":\"aws\",\"account\":\"56565\",\"time\":\"2021-01-19T20:26:38Z\",\"region\":\"ap-south-1\",\"resources\":[\"arn:aws:ap-south-1::product/aws/arn:aws:securityhub:ap-south-1:102707:subscription/v/1.2.0/1.6/finding/cb7ac3afd\"],\"detail\":....

0 Karma

to4kawa
Ultra Champion

try SEDCMD in props.conf ,also

SEDCMD-trim = s/\\\//g

Get Updates on the Splunk Community!

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...

Welcome to the Splunk Community!

(view in My Videos) We're so glad you're here! The Splunk Community is place to connect, learn, give back, and ...