PAN data in indexes. How to tackle them to avoid non-compliance?

koshyk · ‎07-15-2016

We have a Splunk system collecting data from various sources (network, OS, application logs, etc.).
Unfortunately, some of these systems send PAN-related data with unmasked credit card details, but we don't know which ones.
Is there a way to tackle this? We need to track which systems are sending PAN-related data, but we don't want to store that data (or we want to store it only in hashed format).

My only thought is:
- Create an index pci_secure_index with permissions for restricted users only
- Index all data normally, but run a scheduled search to detect PAN information. Collect the matching events and summary index them to "pci_secure_index"
- Delete the matching events from the original index (with the delete command)
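
The detection step in the list above could be prototyped outside Splunk before it is turned into a scheduled search. The following is only a minimal sketch in Python: the regex, the 13 to 19 digit length range, and the function names `luhn_ok` and `find_pans` are assumptions for illustration, not anything from Splunk itself. It pairs a loose digit pattern with a Luhn checksum so that arbitrary long numbers (order IDs, timestamps) are less likely to be flagged.

```python
import re
from typing import List

# Assumption: a PAN candidate is 13 to 19 digits, optionally separated
# by single spaces or hyphens, and not embedded in a longer digit run.
CANDIDATE = re.compile(r"(?<!\d)\d(?:[ -]?\d){12,18}(?!\d)")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_pans(event: str) -> List[str]:
    """Return the Luhn-valid candidate numbers found in one event."""
    hits = []
    for match in CANDIDATE.finditer(event):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_ok(digits):
            hits.append(digits)
    return hits
```

A Luhn pass does not prove that a number is a real card number, so a check like this narrows down which sources to investigate rather than giving a definitive answer.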

Is there a better approach?

(PS: We tried the anonymize data approach, searching for the credit card pattern in the first 5000 characters of each event, but it almost brought the system to its knees.)

gcusello · ‎07-15-2016

If you don't need real time, you could pre-parse the data with a script and then index it in Splunk.
We did this for a customer who wanted to encrypt one field without losing it.
Bye.
Giuseppe
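
A pre-parse script of the kind described above might look roughly like the following. This is a hedged sketch, not the script used for that customer: the `PAN_HMAC_` token format, the keyed SHA-256 hash, and the function name `mask_line` are all assumptions chosen for illustration. Each Luhn-valid candidate is replaced by a truncated HMAC, so the same card number always maps to the same token (which keeps it trackable) while the number itself never reaches the index.

```python
import hashlib
import hmac
import re

# Assumption: same loose candidate pattern, 13 to 19 digits with
# optional single space or hyphen separators.
CANDIDATE = re.compile(r"(?<!\d)\d(?:[ -]?\d){12,18}(?!\d)")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d = d * 2 - 9 if d * 2 > 9 else d * 2
        total += d
    return total % 10 == 0


def mask_line(line: str, key: bytes) -> str:
    """Replace Luhn-valid candidates in a log line with a keyed hash token."""
    def replace(match):
        digits = re.sub(r"[ -]", "", match.group())
        if not luhn_ok(digits):
            return match.group()  # leave non-card numbers untouched
        digest = hmac.new(key, digits.encode(), hashlib.sha256).hexdigest()
        return "PAN_HMAC_" + digest[:16]  # hypothetical token format
    return CANDIDATE.sub(replace, line)
```

The script would read each source file line by line, write `mask_line(line, key)` to the file that Splunk monitors, and keep the key outside Splunk. The key handling and the 16-character truncation are design choices to review against your own compliance requirements.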

woodcock · ‎07-15-2016

If that's the case, deploy more indexers. I don't see any other way.

gcusello · ‎07-15-2016

See this:
http://docs.splunk.com/Documentation/Splunk/6.4.1/Data/Anonymizedata
Bye.
Giuseppe

koshyk · ‎07-15-2016

We tried the anonymize data approach, searching for the credit card pattern in the first 5000 characters of each event, but it almost brought the system to its knees. The above link is good if we are 100% sure of the source or field where the PAN is coming from. But with terabytes of incoming data, scanning every whole event is a performance killer.
