
How to anonymize data using REGEX in transforms.conf for an undefined number of characters?

Communicator

Hi,

I would like to anonymize data (a file system path) using REGEX. I successfully managed to mask data such as IP addresses and credit card numbers, but I am not able to replicate the setup for a value with an undefined number of characters.

Could you please help review the configuration below?

props.conf:

[amit_anonymize_data]
TRANSFORMS-anonymize = filepath-anonymizer

transforms.conf:

[filepath-anonymizer]
REGEX = (?m)^(.*)filePath=\S+(.*)$
FORMAT = $1filePath=XXXX$2
DEST_KEY = _raw

Below is an example of a log line that must be transformed:

2016-02-25 14:40 GMT+1 this is only an example filePath="/tmp/file.log" error script 1

The log is indexed without any modification.
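As a sanity check outside Splunk, the REGEX and FORMAT above can be tried against the sample line. This is only a rough sketch in Python: it assumes Python's re module treats this particular pattern the same way as the PCRE engine Splunk uses, and it rewrites the $1/$2 back-references as \1/\2.

```python
import re

# Pattern and replacement copied from the transforms.conf stanza above.
# Splunk writes back-references as $1/$2; Python's re module uses \1/\2.
REGEX = r'(?m)^(.*)filePath=\S+(.*)$'
FORMAT = r'\1filePath=XXXX\2'

# Sample log line from the question.
sample = ('2016-02-25 14:40 GMT+1 this is only an example '
          'filePath="/tmp/file.log" error script 1')

# Apply the substitution once to the whole line and show the result.
masked = re.sub(REGEX, FORMAT, sample)
print(masked)
```

If the path is masked here, the pattern itself is probably not what keeps _raw unchanged, although a test in Python cannot prove how Splunk applies the transform.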

Thanks for your help.

Cyril


Re: How to anonymize data using REGEX in transforms.conf for an undefined number of characters?

SplunkTrust

What happens when you do this? Anything, or is the _raw unchanged?

Have you also tried without multiline mode (the (?m) at the front)? It may be making the regex behave slightly differently.


Re: How to anonymize data using REGEX in transforms.conf for an undefined number of characters?

SplunkTrust

Is the sourcetype on the input set correctly (amit_anonymize_data)?


Re: How to anonymize data using REGEX in transforms.conf for an undefined number of characters?

Communicator

Yes, the sourcetype is correct.


Re: How to anonymize data using REGEX in transforms.conf for an undefined number of characters?

Communicator

Yes, _raw is unchanged. I just tried without (?m), but no success.

Is the FORMAT correct? My concern is the number of characters that XXXX replaces. If the filePath has 15 characters, will it be replaced by XXXX (4 X's)? Is that right?

Thanks.


Re: How to anonymize data using REGEX in transforms.conf for an undefined number of characters?

SplunkTrust

The FORMAT string looks correct to me. Yes, the filepath will be replaced by 4 X's no matter how many characters are in the original path.
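To illustrate the fixed-length mask, here is a small sketch in Python rather than Splunk itself. The helper name mask_path and the two input lines are made up for the example, and the pattern is a simplified version of the one in the question.

```python
import re


def mask_path(line: str) -> str:
    """Replace whatever follows filePath= (up to the next whitespace) with XXXX.

    mask_path is a hypothetical helper used only for this illustration.
    """
    return re.sub(r'filePath=\S+', 'filePath=XXXX', line)


# Two made-up lines whose paths differ a lot in length.
short_line = mask_path('filePath="/a" error')
long_line = mask_path('filePath="/var/log/app/very/long/name.log" error')

# Both print the same masked text, regardless of the original path length.
print(short_line)
print(long_line)
```

The replacement text is a literal string, so the length of the original path does not influence the length of the mask.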


Re: How to anonymize data using REGEX in transforms.conf for an undefined number of characters?

SplunkTrust

Hi, please try this regex with positive lookahead and positive lookbehind.

props.conf:

[amit_anonymize_data]
TRANSFORMS-anonymize = filepath-anonymizer

transforms.conf:

[filepath-anonymizer]
REGEX = (.*)(?<=filePath=").*(?=")(.*)
FORMAT = $1XXXX$2
DEST_KEY = _raw
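The lookaround pattern can be checked outside Splunk with a short Python sketch. It assumes Python's re module behaves like Splunk's PCRE for this pattern, uses the pattern without any surrounding quote characters, and writes $1/$2 as \1/\2. The second test line and the [^"]* variant are additions for illustration only and are not part of the reply.

```python
import re

# Lookaround pattern from the reply (no surrounding quotes), plus a
# hypothetical variant that stops at the first closing double quote.
GREEDY = r'(.*)(?<=filePath=").*(?=")(.*)'
NEGATED = r'(.*)(?<=filePath=")[^"]*(?=")(.*)'
REPLACEMENT = r'\1XXXX\2'

# Sample line from the question, and a made-up line with a second quoted field.
sample = ('2016-02-25 14:40 GMT+1 this is only an example '
          'filePath="/tmp/file.log" error script 1')
two_quoted = 'x filePath="/tmp/a.log" msg="disk full"'

greedy_sample = re.sub(GREEDY, REPLACEMENT, sample)
greedy_two = re.sub(GREEDY, REPLACEMENT, two_quoted)
negated_two = re.sub(NEGATED, REPLACEMENT, two_quoted)

print(greedy_sample)  # path masked, surrounding quotes kept
print(greedy_two)     # greedy .* runs on to the last double quote in the line
print(negated_two)    # [^"]* stops at the path's own closing quote
```

On the sample line the greedy version masks only the path. When a line carries a second quoted value, the greedy .* also swallows that value, which is why the [^"]* variant may be the safer choice for such data.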


Re: How to anonymize data using REGEX in transforms.conf for an undefined number of characters?

Communicator

Still no success. Based on your input I also tried

(?<=filePath=")\S+(?=")

but that did not work either.

Can anything else impact it?
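One thing that can be checked outside Splunk is what the shorter lookaround pattern actually matches and captures. The following is a rough Python sketch, assuming Python's re module and Splunk's PCRE agree on this pattern.

```python
import re

# The shorter lookaround pattern tried above.
pattern = re.compile(r'(?<=filePath=")\S+(?=")')

# Sample line from the question.
sample = ('2016-02-25 14:40 GMT+1 this is only an example '
          'filePath="/tmp/file.log" error script 1')

match = pattern.search(sample)
matched_text = match.group(0)  # the text between the double quotes
group_count = pattern.groups   # number of capture groups in the pattern

print(matched_text)
print(group_count)
```

The pattern matches the path but defines no capture groups, so a FORMAT that refers to $1 and $2 would have no captured text to put back around the mask. Whether that is related to the behaviour described in this thread is not established.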


Re: How to anonymize data using REGEX in transforms.conf for an undefined number of characters?

SplunkTrust

My apologies. I have corrected my answer.


Re: How to anonymize data using REGEX in transforms.conf for an undefined number of characters?

Communicator

Unfortunately no change. I don't really know what's wrong...
