All,
I have never seen docs or a .conf talk or anything covering this, so I assume it doesn't exist, but I thought I would ask anyway, just in case it's some feature I somehow missed.
Basically, we have email addresses and some other PII coming into a small Splunk instance segmented from the main one. My boss wants the data tokenized as it comes into Splunk, and detokenized at search time based on Splunk user role.
Anything like that available?
You are correct: that feature as described does not exist in core Splunk. I would definitely suggest logging an Enhancement Request through your company's support entitlement if you have a need for it.
Something you may be able to do: index the clear-text data on one instance, then forward the data or otherwise clone the stream and use regular expressions to remove the sensitive data prior to indexing on a second instance. (Yes, I'm hand-waving a little here; depending on your architecture, networks, and the nature of the data, this can be more difficult than it sounds.) Only certain users would be able to log into the sensitive instance, while more users could log into the non-sensitive one. If a user who can see sensitive data logs into the non-sensitive instance, they would only see the stripped, non-sensitive version of the logs. (Again, depending on the kinds of data and your company's risk model around such data, this might actually be the correct answer.)
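For the stripping step on the non-sensitive instance, Splunk's index-time `SEDCMD` setting in `props.conf` can apply the regex scrubbing before events are written to disk. A minimal sketch, assuming emails are the field to mask; the sourcetype name `my_pii_sourcetype` and the replacement text are placeholders for whatever your data actually uses:

```
# props.conf on the indexers (or heavy forwarder) feeding the non-sensitive instance
[my_pii_sourcetype]
# sed-style substitution applied to _raw at index time, before the event is indexed
SEDCMD-mask_email = s/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/<EMAIL-REDACTED>/g
```

Note that this is masking, not tokenization: once the event is indexed on that instance, the original value is gone for good there, which is exactly why you'd keep the clear-text copy on the segmented, restricted instance.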
But selecting fields and replacing them with encrypted tokens at index time, which can then be decrypted by a special search command at search time based on your role: I've only seen that mocked up as part of a demo of a third-party startup's conceptual product, which is not yet in alpha testing. I've also seen some early prototypes around increasing control over the actions performed in the indexing pipeline, which might enable such functionality as well, but as far as I'm aware at the time of this post, that's all pre-alpha too, with no guarantee that any of it makes it into a product.
Can you elaborate on the requirement?
What do you mean by "data coming into Splunk tokenized and detokenized based on Splunk user role"?