Spath calculated field limitations

Aatom
Explorer

Hey Splunkers!

We are running into an issue with an on-prem distributed deployment where the AWS feed is not extracting nested JSON fields at search time without the use of spath. We get first-level and partial second-level auto-extraction, but it stops there. We need to normalize this data with friendly-name aliases, and would like to avoid end users having to use spath with a long rename macro. Yes, KV_MODE is set to json on the SH, IDX, and HF. No, we'd rather not perform indexed extractions. We've raised several limits and are unsure why it wouldn't just auto-extract at search time. Please help!

Here is the issue with using spath in calculated fields as a workaround. I can calculate version and id consistently, but nested values one level further down that contain lists do not return fields at search time.

 

works -> aws : EVAL-version version spath('BodyJson.Message', "version")

works -> aws : EVAL-id id spath('BodyJson.Message', "id")

doesn't work -> aws : EVAL-resources resources spath('BodyJson.Message', 'resources{}')

 

BodyJson: {
Message: {"version":"0","id":"-e154-88b-c","detail-type":"Findings - Imported","source":"aws.","account":"4724","time":"2021-01-13T20:09:26Z","region":"ca-central-1","resources":["arn:aws:ca"],"detail":{"findings":[{"ProductArn":"arn:aws:"...

 

What am I doing wrong here? Also, is there a known limitation on how many cycles of spath calculations the system will run on a specific field? Thanks in advance!

to4kawa
SplunkTrust
index=_internal | head 1 | fields _raw
| eval _raw="{\"BodyJson\":{\"Message\":{\"version\":\"0\",\"id\":\"-e154-88b-c\",\"detail-type\":\"Findings - \\\"Imported\\\"\",\"source\":\"aws.\",\"account\":\"4724\",\"time\":\"2021-01-13T20:09:26Z\",\"region\":\"ca-central-1\",\"resources\":[\"arn:aws:ca\"],\"detail\":{\"findings\":[{\"ProductArn\":\"arn:aws:\"}]}}}}"
| eval resources=json_extract(_raw,'BodyJson.Message','resources{}')

["arn:aws:ca"]
It seems to be extracted incorrectly. It's a bug.

 

index=_internal | head 1 | fields _raw
| eval _raw="{\"BodyJson\":{\"Message\":{\"version\":\"0\",\"id\":\"-e154-88b-c\",\"detail-type\":\"Findings - \\\"Imported\\\"\",\"source\":\"aws.\",\"account\":\"4724\",\"time\":\"2021-01-13T20:09:26Z\",\"region\":\"ca-central-1\",\"resources\":[\"arn:aws:ca\"],\"detail\":{\"findings\":[{\"ProductArn\":\"arn:aws:\"}]}}}}"
| rex mode=sed "s/resources\":\[(.*?)\]/resources\": \1/"
| spath

How about SEDCMD?
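
Separately, it may be worth checking the quoting in the calculated field. In eval expressions, single quotes refer to a field name and double quotes make a string literal. The two working definitions pass the path as "version" and "id", while the failing one passes 'resources{}', which eval would read as a field called resources{} rather than as a path. A small run-anywhere sketch to compare the two, where msg is a hypothetical stand-in for BodyJson.Message and the resources_dq / resources_sq names are made up for illustration:

```
| makeresults
| eval msg="{\"version\":\"0\",\"id\":\"-e154-88b-c\",\"resources\":[\"arn:aws:ca\"]}"
| eval resources_dq=spath(msg, "resources{}")
| eval resources_sq=spath(msg, 'resources{}')
| table resources_dq resources_sq
```

If resources_dq is populated and resources_sq is empty, changing the calculated field to use "resources{}" is the first thing I would try.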


Aatom
Explorer

Thanks for the quick reply, @to4kawa. I think this is part of the issue: in the _raw output, the nested JSON object under Message has its quotes escaped with backslashes. Is my best bet to set up a props/transforms on the SH to replace \" with "? Are there any working examples you could point me towards? Thanks!

"BodyJson": {"Type": "Notification", "MessageId": "4f8b9202e", "TopicArn": "arn:aws:sns:ap-south-1:6679786758:events-ap-south-1", "Message": "{\"version\":\"0\",\"id\":\"0a880\",\"detail-type\":\"Findings - Imported\",\"source\":\"aws\",\"account\":\"56565\",\"time\":\"2021-01-19T20:26:38Z\",\"region\":\"ap-south-1\",\"resources\":[\"arn:aws:ap-south-1::product/aws/arn:aws:securityhub:ap-south-1:102707:subscription/v/1.2.0/1.6/finding/cb7ac3afd\"],\"detail\":....


to4kawa
SplunkTrust

Try SEDCMD in props.conf as well:

SEDCMD-trim = s/\\//g
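
For anyone landing here later, a minimal sketch of where that setting would live. The stanza name [aws] is an assumption taken from the sourcetype shown in the calculated-field listing in the question, and the expression is written here as s/\\//g, i.e. delete every backslash, which is my reading of the intent. SEDCMD is applied when the data is parsed at index time, so it belongs on the heavy forwarder or indexer that first parses the feed rather than on the search head, and it only affects events indexed after the change.

```
# props.conf on the HF (or indexer) that parses this feed
# [aws] is an assumed sourcetype name; adjust it to match yours
[aws]
# Remove every backslash from _raw before the event is indexed
SEDCMD-trim = s/\\//g
```

Removing the backslashes leaves the quotes that wrapped the original Message string in place, and it also strips escapes that were legitimately part of a value, so it is worth checking a freshly indexed sample event with spath after the change.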
