How to mask sensitive data at index time?

Engager

I am trying to mask PII data at index time. Here is an example of PII data I am trying to mask:

RecipientSSNxxx-xx-4321RecipientSSN

I am able to mask it at search time using this:

        source=mysource
        | rex "(?<RecipientSSN>\d{3}-\d{2}-\d{4})"
        | rex field=RecipientSSN mode=sed "s/\d{3}-\d{2}/XXX-XX/g"

However, I need it to be masked at index time. I have tried the following in props.conf and transforms.conf (both in system\local):

props.conf

[nsb_message]
TRANSFORMS-anonymize = ssn-anonymizer

transforms.conf

[ssn-anonymizer]
regex = (\d{3}\-\d{2}\-)(\d{4})
FORMAT= $1XXX-XX-$2
DEST_KEY = _raw

I restarted Splunk and loaded new test files with a one-time file monitor input, but the SSN is still not masked. I also verified that the sourcetype exists in inputs.conf (system\local).

Any help or pointers would be greatly appreciated!

Re: How to mask sensitive data at index time?

Champion

How about following the simple SED example here: https://docs.splunk.com/Documentation/Splunk/latest/Data/Anonymizedata#Anonymize_data_through_a_sed_...

in props.conf

[nsb_message]
SEDCMD-ssn_anon = s/RecipientSSN(\d{3}-\d{2}-)(\d{4})/RecipientSSNXXX-XX-\2/g  
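
As a rough way to sanity-check the expression before restarting Splunk, the same substitution can be replayed outside Splunk. The sketch below is Python, not Splunk configuration. The function names and the sample event (with made-up digits) are hypothetical, and it assumes Python's `re` module treats this particular pattern the same way as the PCRE engine Splunk uses. It also replays the FORMAT string from the question, on the assumption that with DEST_KEY = _raw the expanded FORMAT becomes the whole event.

```python
import re

# Hypothetical sample event; the digits are invented for illustration.
SAMPLE = "RecipientSSN123-45-4321RecipientSSN"

# Same pattern and replacement as the SEDCMD above.
SED_PATTERN = re.compile(r"RecipientSSN(\d{3}-\d{2}-)(\d{4})")
SED_REPLACEMENT = r"RecipientSSNXXX-XX-\2"


def mask_like_sedcmd(event: str) -> str:
    """Replace every match in place, as the trailing /g flag does in sed."""
    return SED_PATTERN.sub(SED_REPLACEMENT, event)


def expand_question_format(event: str) -> str:
    """Expand FORMAT = $1XXX-XX-$2 against the regex from the question.

    Assumption: with DEST_KEY = _raw the expanded FORMAT replaces the
    entire event, so text outside the match is lost.
    """
    match = re.search(r"(\d{3}\-\d{2}\-)(\d{4})", event)
    if match is None:
        return event  # no match: event left unchanged
    return f"{match.group(1)}XXX-XX-{match.group(2)}"


if __name__ == "__main__":
    print(mask_like_sedcmd(SAMPLE))        # RecipientSSNXXX-XX-4321RecipientSSN
    print(expand_question_format(SAMPLE))  # 123-45-XXX-XX-4321
```

The first result keeps the surrounding text and the last four digits. The second still contains the first five digits and drops the text around the match, which suggests the FORMAT string in the question would need rework even once the transform fires. The Splunk documentation also writes that setting name in uppercase, as REGEX.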

Re: How to mask sensitive data at index time?

Engager

That worked! Thanks rjthibod!

Re: How to mask sensitive data at index time?

Contributor

Is the class name after SEDCMD- user-defined?

Re: How to mask sensitive data at index time?

Communicator

From the docs: "Any text after SEDCMD- can be any string that helps you identify what the transformation script does. The clause must exist because it and the SEDCMD stem form the class name for the script." So yes, it is user-defined.
