Hi Splunkers,
I want to mask PII data at search time for specific users.
I checked all the existing questions and answers, but was not able to achieve it. I tried the option below:
my search | rex field=_raw "(?<head>.*)TYPE\s\[PHONE\]\s*\[\+\w{12}\](?<tail>.*)"
| eval _raw=head."TYPE [PHONE] [############] ".tail
Any help is appreciated.
Thanks,
Purush
You can find the search-time data masking methods here (see the accepted answer and the comment below it):
https://answers.splunk.com/answers/234919/search-time-data-masking.html
The problem is that because the masking happens at search time, the underlying raw data still contains the PII. So anyone can run a basic search in "Fast Mode" to bypass this masking and see the original data. In cases like this, we do one of the following (along with working with the data owner to mask the PII at the source, or doing the masking at index time):
1) Delete the current data with PII and re-index it. (causes duplicate license usage)
2) Move the data to a summary index (using a summary index search OR the collect command), with the summary search doing the search-time masking. Once moved, delete the original PII data.
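Option 2 could be sketched roughly as follows. The index names app_logs and masked_summary are hypothetical placeholders, and the sed expression assumes the same phone format as in the question (a "+" followed by 12 word characters). The search would be scheduled, and users would be given access to masked_summary only.

```spl
index=app_logs
| rex field=_raw mode=sed "s/TYPE\s\[PHONE\]\s*\[\+\w{12}\]/TYPE [PHONE] [############]/g"
| collect index=masked_summary
```

Because collect writes the already-masked _raw to the summary index, the PII never lands there; it remains only in the original index until that data is deleted.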
Thank you for your answer!
I don't want to re-index because of the large data volume. With summary indexing there is no need to re-index: just run the job, save the metrics to the summary index, and do not give users access to the original index. That way users are restricted from seeing the original data. However, because we have too many types of data and jobs, we are not moving in that direction.
I agree that if we save the masking as a macro and the user knows the base query, he/she can still see the data.
For this option, I tried to restrict access to only the macro, but it didn't work.
Is your rex command working fine?
Yes, it works perfectly in all the options; I also verified it with an online regex editor. The problem with the first option is that I can put the macro in the restricted search terms for a role, but when I search as a user belonging to that role, no data is returned.
You need to use mode=sed.
See here:
https://docs.splunk.com/Documentation/Splunk/7.2.5/SearchReference/Rex
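For reference, a sed-mode version of the masking from the question might look like the sketch below. It reuses the pattern from the original rex and assumes the phone value is always a "+" followed by 12 word characters; "my search" is a placeholder for the base search.

```spl
my search
| rex field=_raw mode=sed "s/TYPE\s\[PHONE\]\s*\[\+\w{12}\]/TYPE [PHONE] [############]/g"
```

Unlike the rex plus eval approach, this rewrites _raw in a single step and, with the g flag, masks every matching occurrence in the event rather than only one.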
Also, you would probably want to add it as a search filter on the role the users belong to.
Note: IMHO it's not a sustainable solution. A better way would be to put the masked data in a summary index and allow users to see the summarized results only.
Yes, summary indexing is the best solution, and we have already implemented it for one part of the data. The problem with summary indexing is that you need to run a job every 5 minutes to keep it current. I have different types of data and our alerts run extensively, so running a summary indexing job every 5 minutes may not be effective for our environment.
PS: I am going to think in this direction, about how efficiently I can use SI in our environment.