How to mask SSNs in our logs going into Splunk?

ronerf
Explorer

Our code leaked SSNs into our logs and they went into Splunk, so I'm trying to mask them. I tried it two ways (BTW, the regex works when I use it with | regex _raw=):

  1. In etc/system/local/props.conf:

[source::/var/www/app/shared/log/production.log]
SEDCMD-ssn = s/(social_security_number..:..)\d{9}/\1[FILTERED]/g

  2. In etc/system/local/props.conf:

[source::/var/www/app/shared/log/production.log]
TRANSFORMS-ssn = ssn_mask

and etc/system/local/transforms.conf:

[ssn_mask]
DEST_KEY = _raw
REGEX = (social_security_number..:..)\d{9}
FORMAT = $1[FILTERED]

Neither works. What am I missing? This is on 6.5.0.

1 Solution

yannK
Splunk Employee

"The code that generates the logs has been corrected to filter the SSNs, so the goal is to mask the logs that have already been indexed in splunk."

Then no, you cannot safely hide the SSNs in those events at search time, because they are in the raw data.

The solution is to create a search that finds all the events containing an SSN and pipe it to the "| delete" command, which marks those events as deleted in the buckets so they no longer show up in searches. (This may be trickier on an indexer cluster.)
http://docs.splunk.com/Documentation/Splunk/latest/Indexer/RemovedatafromSplunk#Delete_events_from_s...

ronerf
Explorer

This is the relevant part of the JSON blob:

{"params":{"{\"applicants\":{\"primary\":{\"social_security_number\":\"SSNNUMBER\"}}}":"[FILTERED]"}}

SSNNUMBER is a 9-digit number.
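
As a sanity check outside Splunk, the pattern from the question can be run against this sample. What follows is only a minimal Python sketch, not Splunk itself: Python's re module stands in for Splunk's regex engine, 123456789 is a made-up placeholder for SSNNUMBER, and mask_original is a hypothetical helper name. Under those assumptions, the ".." wildcards line up with the escaped quotes (\") on either side of the colon, which suggests the regex itself is not what keeps the masking from working.

```python
import re

# Sample event from the post, with a made-up 9-digit value in place of SSNNUMBER.
SAMPLE_EVENT = r'{"params":{"{\"applicants\":{\"primary\":{\"social_security_number\":\"123456789\"}}}":"[FILTERED]"}}'

# Pattern taken from the SEDCMD / REGEX settings in the question.
ORIGINAL_PATTERN = re.compile(r'(social_security_number..:..)\d{9}')


def mask_original(event: str) -> str:
    # Python spelling of the sed replacement: keep group 1, then append [FILTERED].
    return ORIGINAL_PATTERN.sub(r'\g<1>[FILTERED]', event)


if __name__ == "__main__":
    print(mask_original(SAMPLE_EVENT))
```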

sudosplunk
Motivator

Based on sample data, your SEDCMD setting should be adjusted a little. Below is the modified version. Give it a try...
SEDCMD-ssn = s/(social_security_number..:..)\d{9}(\\")/\1xxxxxxxxx\2/g

Please note that data can't be modified once it has been indexed. This mask will only take effect for new events.
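
To see what the extra (\\") group changes, here is a small Python sketch of the adjusted substitution. It is an approximation run outside Splunk: Python's re module stands in for Splunk's regex engine, and the two sample fragments and the name mask_adjusted are made up for illustration. The trailing group requires the escaped closing quote directly after the nine digits, so a longer digit run is left alone rather than partially masked.

```python
import re

# Adjusted pattern from the reply: the extra group (\\") requires the escaped
# closing quote right after the nine digits and restores it via the second group.
ADJUSTED_PATTERN = re.compile(r'(social_security_number..:..)\d{9}(\\")')


def mask_adjusted(event: str) -> str:
    # Python spelling of the sed replacement: group 1, nine x characters, group 2.
    return ADJUSTED_PATTERN.sub(r'\g<1>xxxxxxxxx\g<2>', event)


# Made-up fragments in the shape of the sample event.
NINE_DIGITS = r'{\"social_security_number\":\"123456789\"}'
TEN_DIGITS = r'{\"social_security_number\":\"1234567890\"}'

if __name__ == "__main__":
    print(mask_adjusted(NINE_DIGITS))
    # The ten-digit value is not matched, because no \" follows the ninth digit.
    print(mask_adjusted(TEN_DIGITS))
```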

You can find some information here: https://answers.splunk.com/answers/22835/how-can-we-anonymize-user-date-at-search-time.html

ronerf
Explorer

Thanks, everyone.

ronerf
Explorer

The code that generates the logs has been corrected to filter the SSNs, so the goal is to mask the logs that have already been indexed in splunk.

somesoni2
Revered Legend

Data that has already been ingested can't be modified. Your masking configuration will only work on new events as they come in. I believe your only option is to delete those events so that they are no longer searchable. If you still want other fields/data from those events, you can mask the data at search time (inline in the search) and use summary indexing to save those records into a different index before deleting them.

sudosplunk
Motivator

Hi ronerf,

Your configurations look good. Can you provide sample event(s) so we can see why these configurations don't work? Also, please remember that these configurations should be applied at both the source and the destination of the data, which means that in a typical deployment the configs should be present on universal forwarders, heavy forwarders (if you're using them), and indexers.

yannK
Splunk Employee

Yes, please provide a few sanitized sample events that show the SSN and the text around it.

Also, the transforms happen at index time, so they have to be set up on the first server that parses the events:

  • for regular logs, this means the indexers, or the first heavy forwarder in the chain (if any)
  • for structured logs (INDEXED_EXTRACTIONS=csv or json, ...), this means the first forwarder that collects the logs (this may be the universal forwarder)