Looking for suggestions on how to mask email addresses that could be in almost any format in a JSON?

lycollicott
Motivator

I have JSON with an agonizing amount of PII, mostly email addresses, but the addresses follow no standard format and appear in no standard position within the JSON. Here are just some examples of the formats:

\"email\":\"mr.rogers@bubba.com\"
\\\"email\\\":\\\"mr.rogers@bubba.com\\\"
\"loginNameOrEmail\": \"mr.rogers@bubba.com\"
\\\"loginNameOrEmail\\\": \\\"mr.rogers@bubba.com\\\"

I need to mask this in props and transforms before it gets indexed, and I need to somehow account for all formats, both known and unknown.

bjcross
Explorer

In your props.conf, under the stanza for the source type, add a SEDCMD, possibly like this:

SEDCMD-email = s/[\w!#$%&'+=?^_`{|}~.-]+@(?:[\w!#$%&'+=?^_`{|}~.-]+)*/XXXXX@EMAIL/g

https://docs.splunk.com/Documentation/Splunk/latest/Data/Anonymizedata#Anonymize_data_through_a_sed_...
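
Before deploying, it may help to try the substitution offline against the sample formats from the question. The following is only a rough sketch: it uses Python's `re` module as a stand-in for Splunk's regex engine (an assumption, since the two are not identical), with a pattern adapted from the SEDCMD above. The names `mask_emails`, `EMAIL_RE`, and `SAMPLES` are hypothetical and not part of Splunk.

```python
import re

# Character class adapted from the SEDCMD in this thread.
# Assumption: Python's `re` behaves closely enough to Splunk's engine
# for these samples; verify on a test index before relying on it.
_ATOM = r"[\w!#$%&'+=?^_`{|}~.-]+"
EMAIL_RE = re.compile(_ATOM + r"@(?:" + _ATOM + r")*")
MASK = "XXXXX@EMAIL"


def mask_emails(text: str) -> str:
    """Replace every address-like token with a fixed mask."""
    return EMAIL_RE.sub(MASK, text)


# The four example formats from the question, kept as raw strings so
# the backslashes survive exactly as they appear in the events.
SAMPLES = [
    r'\"email\":\"mr.rogers@bubba.com\"',
    r'\\\"email\\\":\\\"mr.rogers@bubba.com\\\"',
    r'\"loginNameOrEmail\": \"mr.rogers@bubba.com\"',
    r'\\\"loginNameOrEmail\\\": \\\"mr.rogers@bubba.com\\\"',
]

if __name__ == "__main__":
    for sample in SAMPLES:
        print(mask_emails(sample))
```

Because the pattern keys on the address itself rather than on the field name or the quoting around it, it does not care how many backslashes precede the quotes. The flip side is that it will also mask any other token containing `@` that fits the character class, so it is worth checking a sample of real events first.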
