Looking for suggestions on how to mask email addresses that could be in almost any format in a JSON?

lycollicott
Motivator

I have JSON with an agonizing amount of PII, mostly email addresses, but they follow no standard format and appear in no standard position within the JSON. Here are just some examples of the formats:

\"email\":\"mr.rogers@bubba.com\"
\\\"email\\\":\\\"mr.rogers@bubba.com\\\"
\"loginNameOrEmail\": \"mr.rogers@bubba.com\"
\\\"loginNameOrEmail\\\": \\\"mr.rogers@bubba.com\\\"

I need to mask this in props and transforms before it gets indexed, and I need to account for all formats, both known and unknown.

bjcross
Explorer

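In props.conf, under the stanza for your sourcetype, add a SEDCMD, possibly like this. SEDCMD rewrites the raw event text before it is indexed, so it does not depend on which key holds the address or how many escaping backslashes surround it.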

SEDCMD-email = s/[\w!#$%&'+=?^_`{|}~.-]+@[\w!#$%&'+=?^_`{|}~.-]+/XXXXX@EMAIL/g

https://docs.splunk.com/Documentation/Splunk/latest/Data/Anonymizedata#Anonymize_data_through_a_sed_...
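
Before deploying, it may help to try the pattern against the sample lines from the question. The following is a minimal sketch in Python, not Splunk itself: Splunk applies SEDCMD with its own regex engine, so Python's `re` is only an approximation. The names `ATEXT`, `EMAIL_RE`, `MASK`, and `mask_emails` are made up for illustration, and the pattern assumes a backtick in the character class and at least one character after the `@`.

```python
import re

# Character class taken from the SEDCMD above (assumption: the quote-like
# character in the class is meant to be a backtick).
ATEXT = r"[\w!#$%&'+=?^_`{|}~.-]"

# One or more allowed characters, an "@", then one or more allowed characters.
EMAIL_RE = re.compile(ATEXT + "+@" + ATEXT + "+")

# Same replacement text as the SEDCMD.
MASK = "XXXXX@EMAIL"


def mask_emails(raw: str) -> str:
    """Replace every email-like token in the raw event text with MASK."""
    return EMAIL_RE.sub(MASK, raw)


# The four sample formats from the question, kept verbatim as raw strings.
samples = [
    r'\"email\":\"mr.rogers@bubba.com\"',
    r'\\\"email\\\":\\\"mr.rogers@bubba.com\\\"',
    r'\"loginNameOrEmail\": \"mr.rogers@bubba.com\"',
    r'\\\"loginNameOrEmail\\\": \\\"mr.rogers@bubba.com\\\"',
]

for sample in samples:
    print(mask_emails(sample))
```

Because neither the backslash nor the double quote is in the character class, the match starts and stops at the address itself, and the surrounding escaping is left untouched. Any format not listed here should be added to `samples` and checked the same way.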
