My team has a number of index-time SEDCMD-based masking rules for passwords that appear in known positions within an event. This strategy has worked well for us for a while. We are currently wrestling with a case where users accidentally type their password along with their UPN (user@domain) in the user ID field of a Windows logon. Does anyone have a good way to handle this condition?
Similarly, does anyone know of any projects that curate lists of trusted Splunk transforms for sensitive data masking?
On the documentation side, I see the following: Anonymize data
There's no accounting for stupidity.
Perhaps, however, if the domain portion of the upn is well-known, you can mask everything that follows it.
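A rough sketch of that idea, written in Python only so the regex can be tried outside of Splunk. The domain `example.com`, the `Account Name:` label, the mask string, and the function name are all assumptions for illustration; adjust them to what your events actually look like. The same capture-and-replace pattern could probably be adapted into a SEDCMD `s/.../\1########/` expression, but that is untested here.

```python
import re

# Assumption: example.com stands in for your organization's well-known UPN domain.
KNOWN_DOMAIN = "example.com"

# Assumption: the UPN sits on an "Account Name:" line, as in Windows Security events.
# Group 1 keeps the label and the UPN through the known domain.
# Group 2 is whatever trails the domain on the same line (the suspected password).
UPN_TRAILER = re.compile(
    r"(Account Name:[ \t]+[^\s@]+@" + re.escape(KNOWN_DOMAIN) + r")([^\r\n]+)",
    re.IGNORECASE,
)


def mask_after_domain(event: str, mask: str = "########") -> str:
    """Replace anything following the known domain on the Account Name line."""
    return UPN_TRAILER.sub(lambda m: m.group(1) + mask, event)


if __name__ == "__main__":
    sample = "Account Name:\t\tjdoe@example.comHunter2!\r\nAccount Domain:\t\tEXAMPLE"
    print(mask_after_domain(sample))
```

Events where nothing follows the domain are left untouched, because the second group requires at least one trailing character. One known gap: a longer domain that merely starts with the known one (for example `example.com.au`) would have its suffix masked, so the pattern may need tightening if you have such domains.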
That was my thought as well... some sort of negative look-behind. I thought I was good at regex until I tried to mask passwords with a low false-positive rate.