Getting Data In

Looking for suggestions on how to mask email addresses that could be in almost any format in a JSON?

lycollicott
Motivator

I have a JSON with an agonizing amount of PII which is mostly email addresses, but it is in no standard format and no standard postion within the JSON. Here are just some examples of the format:

\"email\":\"mr.rogers@bubba.com\"
\\\"email\\\":\\\"mr.rogers@bubba.com\\\"
\"loginNameOrEmail\": \"mr.rogers@bubba.com\"
\\\"loginNameOrEmail\\\": \\\"mr.rogers@bubba.com\\\"

I need to mask this in props and transforms before it gets indexed and I need to somehow account for all formats both known and unknown.

0 Karma

bjcross
Explorer

In your props.conf for the source-type add a SEDCMD possibly like this.

SEDCMD-email = s/[\w!#$%&'+=?^_‘{|}~.-]+@(?:[\w!#$%&'+=?^_‘{|}~.-]+)*/XXXXX@EMAIL/g

https://docs.splunk.com/Documentation/Splunk/latest/Data/Anonymizedata#Anonymize_data_through_a_sed_...

0 Karma
Get Updates on the Splunk Community!

Enter the Splunk Community Dashboard Challenge for Your Chance to Win!

The Splunk Community Dashboard Challenge is underway! This is your chance to showcase your skills in creating ...

.conf24 | Session Scheduler is Live!!

.conf24 is happening June 11 - 14 in Las Vegas, and we are thrilled to announce that the conference catalog ...

Introducing the Splunk Community Dashboard Challenge!

Welcome to Splunk Community Dashboard Challenge! This is your chance to showcase your skills in creating ...