Splunk Search

Cannot transform string with regex

scottkurtosys
New Member

Hi

I am trying to transform a couple of strings that are being captured in my Splunk logs.

The strings are similar to this:

{"Key":"Authorization","Value":["Basic EAAAALhzFAxssvST1j4jBCAynyb3F9kHsHFWvijwNkuBb3pnY0zFtrz61YPlxQkP73l9p9ZusdBBfjSrDXgueEipT8xUuRk3tFPIAnmwFbGxluvRa3szorgtEq6VDXuIZL9RgA=="]},{"Key":"Authorization-Token","Value":["BCDC62F494410A7ABAE80457C9566F37"]}]

I have tested the following regex expressions with a couple of tools, and they seem to match

"Authorization","Value":\["(Basic)\s[a-zA-Z0-9+\/]+={0,2}"

"Authorization-Token","Value":\["[a-zA-Z0-9+]+"

I have the following in my $SPLUNK_HOME/etc/system/local/props.conf file

[someapp]
TRANSFORMS-anonymize = authorization-anonymizer, authorization-token-anonymizer

And the following in my $SPLUNK_HOME/etc/system/local/transforms.conf file

`[authorization-anonymizer]
REGEX = "Authorization","Value":["(Basic)\s[a-zA-Z0-9+\/]+={0,2}"
FORMAT = $1"Value":["Basic ##############################################################################################################################$2
DEST_KEY = _raw

[authorization-token-anonymizer]
REGEX= "Authorization-Token","Value":["[a-zA-Z0-9+]+"
FORMAT = $1"Value":["############################$2
DEST_KEY = _raw`

The intention is to replace the strings with # characters, but I clearly have misunderstood something, as the strings are not changing

Could anyone help at all ?

Thanks

_scott


somesoni2
Revered Legend

Give this a try (transforms.conf)

[authorization-anonymizer] 
REGEX =(?m)^(.*"Authorization","Value":\["Basic\s*)[^\"]+(.+)
FORMAT = $1####################$2 
DEST_KEY = _raw 

[authorization-token-anonymizer] 
REGEX =(?m)^(.*"Authorization-Token","Value":\[")[^\"]+(.+)
FORMAT = $1####################$2 
DEST_KEY = _raw
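
To sanity-check the two stanzas above outside Splunk, here is a minimal sketch in Python. It assumes Python's `re` behaves like Splunk's PCRE for these particular patterns, and that with `DEST_KEY = _raw` the expanded FORMAT becomes the new event text, which is why each regex captures everything before and after the secret. The `anonymize` helper and the shortened sample event are made up for illustration.

```python
import re

HASHES = "#" * 20  # fixed-length mask; the exact length is arbitrary

# (regex, format) pairs transcribed from the stanzas above; $1/$2 become \1/\2.
TRANSFORMS = [
    (r'(?m)^(.*"Authorization","Value":\["Basic\s*)[^\"]+(.+)', r"\1" + HASHES + r"\2"),
    (r'(?m)^(.*"Authorization-Token","Value":\[")[^\"]+(.+)', r"\1" + HASHES + r"\2"),
]

def anonymize(raw: str) -> str:
    """Apply each transform in order; a transform that does not match leaves raw unchanged."""
    for pattern, fmt in TRANSFORMS:
        match = re.search(pattern, raw)
        if match:
            # Rebuild the event from the captured groups only (single-line events assumed).
            raw = match.expand(fmt)
    return raw

# Shortened, made-up event in the same shape as the one in the question.
sample = (
    '{"Key":"Authorization","Value":["Basic QWxhZGRpbjpvcGVu=="]},'
    '{"Key":"Authorization-Token","Value":["BCDC62F494410A7ABAE80457C9566F37"]}]'
)
result = anonymize(sample)
print(result)
```

Because `[^\"]+` stops at the closing quote, the mask has the same length no matter how long the real value is.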

FrankVl
Ultra Champion

You're using $1 and $2 in your FORMAT values, while the first regex has only 1 capturing group and the second has none. So that doesn't line up, which is probably why these transforms are not getting applied.

I think you need to adjust your regexes, such that you're capturing the parts before and after the string that needs to be anonymized and then specify a format like $1#####$2.
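
A quick way to see the mismatch is to count the capturing groups. The sketch below uses Python's `re` as a stand-in for Splunk's regex engine (an assumption that holds for these simple patterns); `fixed_re` and the sample value are hypothetical and only illustrate the "capture before and after" idea.

```python
import re

# Patterns from the question, with brackets escaped as in the tool-tested versions.
basic_re = re.compile(r'"Authorization","Value":\["(Basic)\s[a-zA-Z0-9+\/]+={0,2}"')
token_re = re.compile(r'"Authorization-Token","Value":\["[a-zA-Z0-9+]+"')

# FORMAT referenced $1 and $2, but these patterns define fewer groups than that.
group_counts = (basic_re.groups, token_re.groups)
print(group_counts)  # (1, 0)

# Reworked: group 1 = everything up to the secret, group 2 = everything after it.
fixed_re = re.compile(r'^(.*"Authorization","Value":\["Basic\s)[A-Za-z0-9+/]+={0,2}(".*)$')

sample = '{"Key":"Authorization","Value":["Basic QWxhZGRpbjpvcGVu=="]}'  # made-up value
match = fixed_re.match(sample)
masked = match.expand(r"\1#####\2")  # Python spelling of FORMAT = $1#####$2
print(masked)
```

The secret itself is matched but deliberately left outside any group, so it never reaches the output.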


scottkurtosys
New Member

So if I were to attempt to use something like this

(.*"Authorization","Value":["Basic\s)(.*={1,2})("]},{"Key":"Authorization-Token","Value":[")(.{32})(.*)

Where each () capture group matches sections of the whole

Could I then use a FORMAT of $1 ##### $3 ##### $5

To hash out the two strings all in a single transform ?

Or am I still misunderstanding the capture groups and FORMAT statement ?

Also, do quote marks need to be escaped in Splunk regexes ?

Thanks


FrankVl
Ultra Champion

Yes, something like that should work. Although there is not much purpose for putting the parts you don't want to keep in a capture group.
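
As a rough illustration of the single-transform idea, the sketch below captures only the three parts that are kept and leaves both secrets uncaptured, so the replacement is `$1#####$2#####$3` rather than `$1 ##### $3 ##### $5`. It is written in Python as a stand-in for Splunk's regex engine, and it assumes the two headers always appear next to each other in this order, as in the sample; `combined_re` and `anonymize_both` are hypothetical names.

```python
import re

# Brackets and braces are escaped; the secrets are matched by [^"]+ but not captured.
combined_re = re.compile(
    r'^(.*"Authorization","Value":\["Basic\s)[^"]+'
    r'("\]\},\{"Key":"Authorization-Token","Value":\[")[^"]+'
    r'(.*)$'
)

def anonymize_both(raw: str) -> str:
    """Mask both values in one pass; return raw unchanged if the pattern does not match."""
    match = combined_re.match(raw)
    if match is None:
        return raw
    return match.expand(r"\1#####\2#####\3")

# Shortened, made-up event in the same shape as the one in the question.
sample = (
    '{"Key":"Authorization","Value":["Basic QWxhZGRpbjpvcGVu=="]},'
    '{"Key":"Authorization-Token","Value":["BCDC62F494410A7ABAE80457C9566F37"]}]'
)
result = anonymize_both(sample)
print(result)
```

The trade-off against two separate transforms is that one regex masks nothing at all if either header is missing or the order changes.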


scottkurtosys
New Member

Thanks for pointing me in the right direction. Have got it working now

🐵
