Getting Data In

Transforms Truncating - Setting SOURCE_KEY

mmendez-opentec
Explorer

We are currently having an issue where our masking transforms are not working due to the length of _raw being too large. If we set LOOKAHEAD to a higher value the masking works.

_raw has request.body at the end of the event.

Since request.body is the only part of the event the transform needs to see, we tried setting it as the SOURCE_KEY, but that doesn't seem to do anything, and we can't find any related log messages.

 

Tried SOURCE_KEY = request.body or SOURCE_KEY = request
 
and tried with
 
[acceptable_keys]
request = request.body
 
or 
 
[acceptable_keys]
request = request
 


 

How do we use SOURCE_KEY to limit where the transforms regex is applying?
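For readers following along, the LOOKAHEAD workaround described above might look roughly like the sketch below. The stanza name, sourcetype name, mask string, and LOOKAHEAD value are all placeholders, not taken from the poster's environment. The regex is modeled on the escaped `\"ssn\":` form shown in the poster's own stanza.

```ini
# transforms.conf (sketch; stanza name and LOOKAHEAD value are illustrative)
[mask_ssn_in_raw]
# LOOKAHEAD sets how many characters of the source key the REGEX scans.
# The documented default is 4096, which a long _raw can easily exceed.
LOOKAHEAD = 65536
# Capture everything before and after the digits, then rebuild _raw
# with a fixed mask in place of the digits.
REGEX = (?s)^(.*\\"ssn\\":\s*\\")\d+(\\".*)$
FORMAT = $1#########$2
DEST_KEY = _raw

# props.conf (sourcetype name is a placeholder)
[your_sourcetype]
TRANSFORMS-mask_ssn = mask_ssn_in_raw
```

Because the rewritten _raw is built from what the regex captured, LOOKAHEAD probably needs to cover the longest event you expect, not just the position of the ssn value. Note also that the greedy leading `.*` means only the last ssn occurrence in an event would be masked by this sketch.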

 

1 Solution

PickleRick
SplunkTrust

Yes. Edge processor seems to be the best shot here (anyway, manipulating structured data like json with regexes is risky).


mmendez-opentec
Explorer

Okay thanks for the feedback


mmendez-opentec
Explorer

Thanks for the reply.

1. Is there a way to verify the assumption that our JSON event is stored as escaped text on disk?

There are a couple of ways to make this work outside of Splunk, but they are not ideal.

2. Is there some way to index request.body, or otherwise make it usable as a SOURCE_KEY, in a performant way? Perhaps with some logic in the forwarder?


isoutamo
SplunkTrust
Click the > next to the event to expand it. This shows more information, including an "Event Actions" button; click it and select "Show Source".
If your event is not too long, this displays it as it is stored on disk, so you can see, for example, the escape characters.

PickleRick
SplunkTrust

For index-time transforms, SOURCE_KEY has to point at something that exists at index time, such as an indexed field. You can't apply a transform to a search-time extracted field because it doesn't exist in the indexing pipeline.
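To illustrate the point, here is a small sketch of an index-time transform whose SOURCE_KEY refers to a key that does exist in the indexing pipeline. The stanza name, path, and index name are hypothetical. The example routes events by source rather than masking anything, and is only meant to show the kind of value SOURCE_KEY accepts at index time.

```ini
# transforms.conf (sketch; stanza name, path and index name are hypothetical)
[route_payments_by_source]
# MetaData:Source is available while the event is being indexed,
# unlike a search-time extracted field such as request.body.
SOURCE_KEY = MetaData:Source
REGEX = /var/log/payments/
DEST_KEY = _MetaData:Index
FORMAT = payments
```

As I understand the transforms.conf spec, SOURCE_KEY defaults to _raw, and a `field:<name>` form exists for fields that have already been indexed. Neither of those makes a search-time field like request.body visible to an index-time transform.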

mmendez-opentec
Explorer

Okay, I think I should look into making request.body an indexed field.

Is this something that can be done in a performant way?


PickleRick
SplunkTrust

Don't do that. Indexed fields of high cardinality are not a good idea. Oh, and even if you wanted to modify an indexed field, it wouldn't change the raw data.

mmendez-opentec
Explorer

More info: our stanza in transforms.conf looks like this:

 

[ssn]
REGEX = (.*)((?i)\\"ssn\\":)(\s*)(\\")(\d+)(\\")(\s*)(.*)
FORMAT = ssn::"$5"
WRITE_META = true
SOURCE_KEY = request.body
 
[acceptable_keys]
request = request.body

isoutamo
SplunkTrust

I think acceptable_keys didn't work because your JSON event is stored as escaped text on disk.

If you want to treat the event as JSON, you must use INGEST_EVAL and the JSON functions. But I expect that you would then hit the same limit again, because the event has to be read in, converted from escaped text to JSON, and written back to the stream.

The best option is to do this masking before ingestion, with a tool other than Splunk's props and transforms.

Can you ask the source to mask the data itself, or can you use, for example, Ingest Actions, Edge Processor, or Ingest Processor? Cribl is another option outside the Splunk world.
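For anyone who wants to experiment with the INGEST_EVAL route mentioned above, a minimal sketch might look like the following. It uses the plain eval replace() function on _raw instead of the JSON functions, to avoid assumptions about how the escaped body would be parsed. The stanza name, sourcetype name, and mask string are placeholders. Whether this avoids the size limit discussed in this thread is untested here.

```ini
# transforms.conf (sketch; stanza name is hypothetical)
[mask_ssn_ingest_eval]
# replace(<string>, <regex>, <replacement>) rewrites _raw directly.
# The pattern is deliberately loose: "ssn", then up to 10 non-digit
# characters (escaped quotes, colon, spaces), then the digits to mask.
INGEST_EVAL = _raw=replace(_raw, "(?i)(ssn[^0-9]{1,10})[0-9]+", "\1#########")

# props.conf (sourcetype name is a placeholder)
[your_sourcetype]
TRANSFORMS-mask_ssn = mask_ssn_ingest_eval
```

The loose pattern sidesteps escaping backslashes and quotes inside an eval string, at the cost of possibly matching other keys that merely contain "ssn". Tighten it against real sample events before relying on it.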

