Splunk Search

Unable to mask data with regex

cborchgrevink
Engager

Example Log:

CEF:0|WAF|SIEMintegration|1|1|Normal|0| fileId=989000730114151753 sourceServiceName=website.com postbody=first_name\=XXXXXX&last_name\=XXXX&shipping_first_name\=ABCDE&shipping_last_name\=EFGHI&record_number\=123412345

I am having trouble getting my regex in transforms.conf to mask:
1. shipping_first_name
2. shipping_last_name
3. record_number

transforms.conf

[record-anonymizer]
REGEX = (?m)^(.*)record_number..\d{2,}$
FORMAT = $1rn=##
DEST_KEY = _raw

[first-name-anonymizer]
REGEX = (?m)^(.*)shipping_first_name..(\w{2,})&$
FORMAT = $1fn=##$
DEST_KEY = _raw

[last-name-anonymizer]
REGEX = (?m)^(.*)shipping_last_name..(\w{2,})&$
FORMAT = $1ln=##$
DEST_KEY = _raw

props.conf

[Test]
TRANSFORMS-anonymize = record-anonymizer, first-name-anonymizer, last-name-anonymizer

tom_frotscher
Builder

Hi,

I tried your regexes in regex101 and they do not match correctly. What you want is a regex that captures everything up to the string you want to mask, and everything after it. Then, in the FORMAT field, you use the first capture group, put your mask in the middle, and append the second capture group.

Try this:

[record-anonymizer]
REGEX = (?m)^(.*)record_number\\\=\d+(.*)$
FORMAT = $1record_number=##$2
DEST_KEY = _raw

[first-name-anonymizer]
REGEX = (?m)^(.*)shipping_first_name\\\=\w+(&.*)$
FORMAT = $1shipping_first_name=##$2
DEST_KEY = _raw

[last-name-anonymizer]
REGEX = (?m)^(.*)shipping_last_name\\\=\w+(&.*)$
FORMAT = $1shipping_last_name=##$2
DEST_KEY = _raw

Greetings
Tom
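
As a rough check outside Splunk, the three REGEX/FORMAT pairs above can be run against the sample event from the question in Python. This is only a sketch: `apply_transform` and `mask_event` are hypothetical helpers that approximate how a `DEST_KEY = _raw` transform rebuilds the event from FORMAT, and Python's `re` module is assumed to behave like Splunk's PCRE for these particular patterns.

```python
import re

# Sample event from the question; the backslash before "=" is literal.
EVENT = (
    r"CEF:0|WAF|SIEMintegration|1|1|Normal|0| fileId=989000730114151753 "
    r"sourceServiceName=website.com postbody=first_name\=XXXXXX&last_name\=XXXX"
    r"&shipping_first_name\=ABCDE&shipping_last_name\=EFGHI&record_number\=123412345"
)

# (REGEX, FORMAT) pairs copied from the transforms above, in props.conf order.
TRANSFORMS = [
    (r"(?m)^(.*)record_number\\\=\d+(.*)$", "$1record_number=##$2"),
    (r"(?m)^(.*)shipping_first_name\\\=\w+(&.*)$", "$1shipping_first_name=##$2"),
    (r"(?m)^(.*)shipping_last_name\\\=\w+(&.*)$", "$1shipping_last_name=##$2"),
]


def apply_transform(raw, regex, fmt):
    """Approximate one transform: on a match, rebuild the event from FORMAT."""
    match = re.search(regex, raw)
    if match is None:
        return raw  # no match: the event is left unchanged
    # Substitute $1, $2, ... in FORMAT with the corresponding capture groups.
    return re.sub(r"\$(\d+)", lambda m: match.group(int(m.group(1))), fmt)


def mask_event(raw):
    """Apply every transform in order, feeding each result into the next."""
    for regex, fmt in TRANSFORMS:
        raw = apply_transform(raw, regex, fmt)
    return raw


masked = mask_event(EVENT)
print(masked)
```

Note that the FORMAT strings write a plain `=` after the field name, so the masked fields lose the backslash that the unmasked fields (`first_name\=`, `last_name\=`) still carry.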

pruthvikrishnap
Contributor

Try this

[my_sourcetype]
[source::/path/to/my/logs]
SEDCMD-remove_secret_data = <regex>
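
The post leaves the actual expression open. The idea behind a sed-style substitution can be sketched in Python: instead of capturing the whole line, match only the field name plus its value and replace the value in place. The pattern and the `mask_in_place` helper below are hypothetical, not taken from the thread. In props.conf the same idea would presumably be written as a SEDCMD of the form `s/<regex>/<replacement>/g` with `\1` for the captured group, which should be checked against the Splunk documentation.

```python
import re

# Sample event from the question; the backslash before "=" is literal.
EVENT = (
    r"CEF:0|WAF|SIEMintegration|1|1|Normal|0| fileId=989000730114151753 "
    r"sourceServiceName=website.com postbody=first_name\=XXXXXX&last_name\=XXXX"
    r"&shipping_first_name\=ABCDE&shipping_last_name\=EFGHI&record_number\=123412345"
)

# Hypothetical pattern: group 1 keeps the field name and the literal "\=",
# and the \w+ after it is the value that gets replaced.
MASK = re.compile(
    r"((?:shipping_first_name|shipping_last_name|record_number)\\=)\w+"
)


def mask_in_place(raw):
    """Replace the value of each sensitive field with ##, like sed's s///g."""
    return MASK.sub(r"\g<1>##", raw)


masked_in_place = mask_in_place(EVENT)
print(masked_in_place)
```

Because only the value is rewritten, one expression covers all three fields and the rest of the event, including the `\=` separators, is left untouched.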