We currently use Splunk Cloud. I've added the following to the Advanced section of "Edit Source Type" to mask some fields coming in from Auth0:
s/\"user_name\"\:\"[^\"]+\"/\"user_name\":\"###############\"/g
However, while this masks the data in the _raw json, it doesn't appear to mask the data in the data.user_name event dropdown. See below:
data.user_name john doe
##########
My question is, is a heavy forwarder setup necessary to mask this data in the events as well?
Thank you.
Masking in the HF is often easier because you (usually) have direct access to the file system making it easier to make changes. However, it is possible to mask data at index-time from the Splunk Cloud UI.
You'll do that by creating a transform that masks the data and then referencing that transform from the appropriate sourcetype. This is exactly how we do it in on-prem systems. In the Splunk Cloud world, those props and transforms are put into a custom app and uploaded.
If you don't want to bother with an app, you can use the GUI. Go to Settings->Fields->Field Transformations to define the masking transform. Then go to the advanced section of the "Edit Source Type" and add a TRANSFORMS setting to invoke that transform. Splunk Cloud will magically apply the settings to the indexers and search heads.
Thank you for this. I've added:
s/\"user_name\"\:\"[^\"]+\"/\"user_name\":\"###############\"/g
into the "Edit Source Type" Advanced section, and it doesn't appear to mask the values as expected. Is there something wrong with my regex here? I tested it in the Search & Reporting app and it works there, but at index time it doesn't appear to work.
Also, would it be a better option to place this in the Settings->Fields->Field Transformations section?
The Advanced section of Edit Source Type calls for two fields, but you provided only one. The value in your reply should go in the right box - what did you put in the left box? It should have been something like SEDCMD-foo. Also, settings put in this page of the UI apply at search time rather than index time.
To change data at index-time, the setting must be in a transform and that transform must be mentioned in a TRANSFORMS-foo setting in Edit Source Type.
Thank you. On the left-hand side, I have SEDCMD-maskusername. Based on what I have read, this should mask things at index time, and it has. However, is my regex off? At times when it masks the data, it pulls in the \"user_name\":"\xxxxxx\" and I'm not quite sure why it's doing that.
It's impossible to evaluate a regex without sample data. If you don't want to share sample data here then use a site like regex101.com to test your regex against some of your data.
FWIW, the backslashes in the regex are unnecessary.
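If you want a quick local check before pasting into regex101.com, here is a minimal Python sketch. The sample event is made up (only the user_name field comes from this thread), and Python's re module is standing in for Splunk's sed-style SEDCMD, so treat it as an approximation rather than a faithful reproduction of index-time behavior. It compares the pattern as posted with a version that drops the backslashes.

```python
import re

# Hypothetical Auth0-style event; every field except user_name is invented.
sample = '{"date":"2024-01-01T00:00:00Z","data":{"user_name":"john doe","type":"s"}}'

# Pattern as posted in the thread, with the quotes and colon escaped.
escaped_pattern = r'\"user_name\"\:\"[^\"]+\"'
# Same pattern without the backslashes.
plain_pattern = r'"user_name":"[^"]+"'

# Replacement text: the field name followed by 15 '#' characters.
replacement = '"user_name":"' + "#" * 15 + '"'

masked_escaped = re.sub(escaped_pattern, replacement, sample)
masked_plain = re.sub(plain_pattern, replacement, sample)

print(masked_plain)
```

On this sample both patterns produce the same masked string, which is consistent with the backslashes being unnecessary. It says nothing about events whose raw text differs from the sample, so real data is still needed to explain the odd matches.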
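@ballen1 - I'm presuming that this is JSON data.
This could be because you are using INDEXED_EXTRACTIONS. In that case the data.user_name field is extracted and indexed separately from _raw, so masking _raw does not change the indexed field value. If you use search-time extraction with KV_MODE=json instead, the field is read from the masked _raw and you will see only the masked value.
(There are many unknowns here, so this is my best guess at the most likely scenario, based on my assumptions.)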
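To illustrate the idea, here is a toy Python model, not Splunk's actual pipeline. It assumes the indexed field is captured from the original event before the raw text is masked, and that the search-time field is parsed from the stored (masked) raw text. The sample event and variable names are made up.

```python
import json
import re

MASK_VALUE = "#" * 15
MASK = re.compile(r'"user_name":"[^"]+"')


def mask_raw(raw: str) -> str:
    """Mask the user_name value in the raw event text."""
    return MASK.sub('"user_name":"' + MASK_VALUE + '"', raw)


# Hypothetical incoming event.
raw = '{"data":{"user_name":"john doe"}}'

# Toy index-time extraction: the field is read from the ORIGINAL event,
# and only afterwards is the raw text masked and stored.
indexed_fields = {"data.user_name": json.loads(raw)["data"]["user_name"]}
stored_raw = mask_raw(raw)

# Toy search-time extraction: the field is read from the STORED raw text.
search_fields = {"data.user_name": json.loads(stored_raw)["data"]["user_name"]}

print(indexed_fields)
print(search_fields)
```

In this model the stored raw text is masked, the index-time field still holds the original name, and the search-time field holds the mask, which matches the symptom described in the question.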
I hope this helps!!!