
Sanitize already indexed data

mslvrstn
Communicator

There is some data that we want to sanitize in Splunk. I've already got a SEDCMD to do it for newly indexed data, but is there some way to modify the events that have already been indexed in Splunk? At worst, I will delete the events, but ideally I would like to just XXX out a specific field.


davidpaper
Contributor

Hi,

Data in Splunk is indeed immutable. That doesn't mean that, with a little work, the data can't be cleaned up and made available for search without the PII in it.

0) You already nailed part of the solution: a SEDCMD keeps the problem from getting worse as new data is indexed.
1) Create a search that finds all the events containing the PII that needs to be cleansed. Run it from the ./splunk CLI and dump the results to a file.
2) Use your favorite text mangling tools (sed, awk, perl, LISP 🙂 ) to sanitize that file on disk.
3) Run the original search again, this time with '... | delete' at the end, to mark the existing events as unavailable for inclusion in search results. Note that this requires a role with the delete capability (such as can_delete), and it hides the events rather than freeing the disk space they occupy.
4) Use ./splunk add oneshot to re-index the sanitized data file.
5) Enjoy a frosty beverage of your choice for a job well done.

The original search and the '| delete' may take quite a while, depending on how many events need to be found and extracted. The oneshot will slurp the file back in as fast as the forwarder/indexer can absorb it. Note that oneshot WILL count against the license, so plan accordingly.
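
For step 2, here is a minimal sketch in Python of what the "text mangling" might look like. It assumes the exported file holds one raw event per line and that the sensitive value appears as a key=value pair; the field name ssn, the XXX mask, and the function names are all hypothetical placeholders, so adjust the pattern to match your actual events.

```python
import re


def mask_field(line: str, field: str = "ssn", mask: str = "XXX") -> str:
    """Replace the value in `field=value` with `mask`, leaving the rest of the event intact.

    Assumes key=value formatting; the value may be bare or double-quoted.
    """
    pattern = re.compile(rf'\b({re.escape(field)}=)("[^"]*"|\S+)')
    # A function replacement avoids any backslash/group-reference surprises in `mask`.
    return pattern.sub(lambda m: m.group(1) + mask, line)


def sanitize_file(src_path: str, dst_path: str, field: str = "ssn") -> None:
    """Stream the exported events from src_path to dst_path, masking `field` on each line."""
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(mask_field(line, field))
```

Writing to a separate output file keeps the untouched export around until you have checked the sanitized copy, which is worth doing before running '| delete' in step 3.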


Jason
Motivator

As far as I know, once data is indexed, it is immutable. You can restrict access to the data via a role's search filter, and you can use | rex mode=sed ... to hide data at search time. Perhaps combine both to enforce a sed for a particular role?
