Is it possible to mask or anonymize data at index time for the _internal index? I have an alert action configured with a webhook, and I'd like to mask the URI of the request in the internal logs.
I'm able to mask the value at search time with this SPL.
index=_internal action=webhook | rex field=url mode=sed "s/https?:\/\/www.domain.com\/(.*)/https:\/\/www.domain.com\/XXXX-XXXX-XXXX/g" | table url
I tried to port this configuration to /opt/splunk/etc/system/local/ by creating a props.conf with the following.
[sourcetype::_internal]
SEDCMD-url = s/https?:\/\/www.domain.com\/(.*)/https:\/\/www.domain.com\/XXXX-XXXX-XXXX/g
AND
[splunkd]
SEDCMD-url = s/https?:\/\/www.domain.com\/(.*)/https:\/\/www.domain.com\/XXXX-XXXX-XXXX/g
Neither stanza works.
This is a standalone instance of Splunk running on an EC2 instance. So my question is: is it even possible to filter Splunk-generated logs? Should I route these through transforms.conf and do it there? Is that possible?
Any help or insight would be greatly appreciated.
I have never tried this, but in principle it should work for all internal logs except _audit.
Try [splunkd] as a sourcetype stanza or [source::.../var/log/splunk/splunkd*] as a source stanza, depending on which events you are trying to mask. Remember that source stanzas override sourcetype stanzas.
BUT if you do this and later have any issues with Splunk, it probably gives Splunk a reason to deny you full support until you remove that configuration.
While I do understand that compliance people (I suppose that's where the idea ultimately comes from) sometimes have their reasons, sometimes they are a bit overzealous.
Remember that _internal is, as the name says, Splunk's internal index. It should contain only things relevant to Splunk's inner workings, and it is not meant to be accessed by non-admins. So there should be no data there that the admins could not obtain anyway.
So while you should technically be able to mask some data out of your events, it might make troubleshooting more difficult (the supportability point raised by @isoutamo is also a very good one). You must also remember that parsing, and all associated activities like SEDCMD, is done on the first heavy component in the event's path (heavy forwarder, indexer, or search head), so you'd need to place the props/transforms on the search head(s) generating those alerts. That is a very unintuitive place to look for such settings if someone inherits your environment in the future.
So while it is technically possible, I'd be hard pressed to call this a good idea.
@isoutamo This worked perfectly! Thank you for your input. Seems the `source` monitor stanza was the way to go. Here is my final configuration for future Splunkers that want to accomplish the same.
[source::.../var/log/splunk/splunkd*]
SEDCMD-url = s/https?:\/\/www.domain.com\/(.*)/https:\/\/www.domain.com\/XXXX-XXXX-XXXX/g
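For anyone who wants to sanity-check the expression before restarting Splunk, here is a rough Python sketch of what the substitution does to an event. It assumes Python's re module behaves like Splunk's regex engine for a pattern this simple. The sample event text and the NARROW variant are made up for illustration and are not from the thread.

```python
import re

# Pattern from the SEDCMD above, with the sed-style "\/" escapes dropped
# because "/" needs no escaping in Python.
# Assumption: Python's re behaves like Splunk's regex engine for this pattern.
ORIGINAL = re.compile(r"https?://www.domain.com/(.*)")

# Hypothetical narrower variant (not from the thread): stop matching at
# whitespace or a double quote instead of running to the end of the line.
NARROW = re.compile(r'https?://www\.domain\.com/[^\s"]*')

REPLACEMENT = "https://www.domain.com/XXXX-XXXX-XXXX"


def mask(pattern, event: str) -> str:
    """Replace every match of pattern in event with the masked URL."""
    return pattern.sub(REPLACEMENT, event)


# Hypothetical event text; real splunkd.log lines will look different.
sample = 'action=webhook, url="https://www.domain.com/hooks/abc123", status=200'

print(mask(ORIGINAL, sample))
print(mask(NARROW, sample))
```

With this sample, the original pattern also swallows everything after the URL up to the end of the line, because (.*) is greedy, while the narrower variant leaves the trailing fields in place. Whether that matters depends on what follows the URL in your actual events, so check against real log lines before changing anything.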