Looking for suggestions on how to mask email addresses that could be in almost any format in a JSON?

lycollicott
Motivator

I have JSON with an agonizing amount of PII, mostly email addresses, but they appear in no standard format and at no standard position within the JSON. Here are just some examples of the formats:

\"email\":\"mr.rogers@bubba.com\"
\\\"email\\\":\\\"mr.rogers@bubba.com\\\"
\"loginNameOrEmail\": \"mr.rogers@bubba.com\"
\\\"loginNameOrEmail\\\": \\\"mr.rogers@bubba.com\\\"

I need to mask these in props.conf and transforms.conf before the data gets indexed, and I need to somehow account for all formats, both known and unknown.

bjcross
Explorer

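In the props.conf stanza for your sourcetype, add a SEDCMD, possibly like this. It matches on the shape of the address itself rather than on the surrounding key name or escaping, so it should not matter where in the JSON the address appears.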

SEDCMD-email = s/[\w!#$%&'+=?^_`{|}~.-]+@(?:[\w!#$%&'+=?^_`{|}~.-]+)*/XXXXX@EMAIL/g

https://docs.splunk.com/Documentation/Splunk/latest/Data/Anonymizedata#Anonymize_data_through_a_sed_...
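Before deploying, it may be worth checking the pattern against the sample formats from the question. The following is only a rough sketch in Python, not Splunk itself: it assumes Python's `re` module behaves closely enough to Splunk's regex engine for this pattern, and it assumes the quote character inside the character class is meant to be a plain backtick. The name `mask_emails` is made up for illustration.

```python
import re

# Approximation of the SEDCMD pattern above, written for Python's re module.
# Assumption: the quote character in the class is a plain backtick.
EMAIL_RE = re.compile(r"[\w!#$%&'+=?^_`{|}~.-]+@(?:[\w!#$%&'+=?^_`{|}~.-]+)*")


def mask_emails(raw: str) -> str:
    """Replace anything that looks like an email address with a fixed token."""
    return EMAIL_RE.sub("XXXXX@EMAIL", raw)


# The four sample formats from the question, kept as raw strings so the
# backslashes stay exactly as they appear in the events.
samples = [
    r'\"email\":\"mr.rogers@bubba.com\"',
    r'\\\"email\\\":\\\"mr.rogers@bubba.com\\\"',
    r'\"loginNameOrEmail\": \"mr.rogers@bubba.com\"',
    r'\\\"loginNameOrEmail\\\": \\\"mr.rogers@bubba.com\\\"',
]

for sample in samples:
    print(mask_emails(sample))
```

Because the character class excludes the double quote and the backslash, the match stops at the escaping around the value, so the key names and escape sequences are left untouched. Note that characters such as `=` are inside the class, so an unquoted `key=user@host` would be masked as a whole; that errs on the side of over-masking.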
