Hi,
I'm writing an integration for one of our security solutions.
I'm implementing an alert action, and when the alert is raised I want a script to run on the endpoints.
So I set up the Search Head as a deployment server in order to deploy the scripts to the endpoints (each of which has a Universal Forwarder installed), but I don't know how to trigger that script to run after the alert is raised.
I thought about defining a scripted input in the Forwarders' inputs.conf so that it runs on an interval and somehow checks whether it needs to execute its core logic, but I'm not sure how to signal the Forwarder that it should.
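Concretely, what I had in mind (app name, script path, and interval are just placeholders) was something like this on the UF:

```ini
# inputs.conf, deployed to the UFs via the deployment server -- sketch only
[script://$SPLUNK_HOME/etc/apps/my_app/bin/collect.sh]
interval = 60
disabled = 0
```

and collect.sh would begin with a check like `[ -f /opt/collect/run_flag ] || exit 0` — the part I can't figure out is how to set that flag from the Search Head side when the alert fires.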
If anyone has a suggestion, I would love to hear it.
Thanks
This is long-winded and hand-wavy, but my gut instinct is that your alert action might fare better invoking the API of an Endpoint Detection and Response (EDR) agent (something like Tanium), whose API you would use to ask specific questions and gather data. Even better, the alert action could kick off a playbook in a Security Orchestration, Automation, and Response (SOAR) platform like Phantom, where data collection is just one step of your automated response.
But let's talk about options with Splunk itself.
As you mentioned, the UFs on your endpoints are already checking in to your Search Head (which is also acting as a deployment server). One option is to leverage that: use the REST API to dynamically add your endpoint to a known server class. This presents some challenges, though. First, while the SH and DS are currently the same machine, as you manage more forwarders and/or your environment grows and search activity increases, the DS will likely become a dedicated Splunk instance — either because the DS needs to handle more traffic or because the SH needs to grow into a Search Head Cluster — so your script will need to handle logging in to a different instance to make the API call. Second, the DS model is a pull model: forwarders check in to the deployment server on an interval to see which configurations they should have. That makes it harder to know when you can safely remove the endpoint from the server class (assuming this is not data you want to gather on a regular basis), and if something has happened to the UF on the endpoint, it may never pull the configuration at all.
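To make the server-class idea concrete, here is a rough sketch (not a definitive implementation) of the REST call an alert-action script could build against the deployment server's generic `configs/conf-serverclass` endpoint to whitelist a client. The server class name, whitelist index, and hostnames are placeholders; you would need real credentials, and after changing serverclass.conf you still have to reload the deployment server (e.g. `splunk reload deploy-server`).

```python
# Sketch: add an endpoint to a deployment-server server class via REST.
# Uses the generic configs/conf-serverclass endpoint; the server class name,
# whitelist index, and hostnames below are placeholder assumptions.
from urllib.parse import quote

def serverclass_whitelist_call(ds_host, serverclass, client_name, index):
    """Build the (url, payload) pair for whitelisting client_name."""
    # Stanza names contain ':' and must be URL-encoded in the path.
    stanza = quote(f"serverClass:{serverclass}", safe="")
    url = f"https://{ds_host}:8089/services/configs/conf-serverclass/{stanza}"
    payload = {f"whitelist.{index}": client_name}
    return url, payload

url, payload = serverclass_whitelist_call(
    "ds.example.com", "alert_collect", "endpoint01", 10)
# The alert action would then POST this, e.g.:
#   requests.post(url, data=payload, auth=("admin", "..."), verify=False)
# and afterwards reload the deployment server so the change takes effect.
```

The pure function keeps the URL/payload construction testable separately from the network call, which also makes it easy to point at a dedicated DS later when the SH and DS split.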
A second option, since we're already talking about logging in to other nodes, is to log in to each UF's REST API directly and enable the input on that particular node. As before, a downed forwarder is a problem, but additionally the connection flow is now from the SH to the UF (which might not be permitted depending on your deployment scenario). Furthermore, since each UF has its own admin password, you would probably need a scheme that provisions an account on every UF for your SH to connect to, along with proper password/credential rotation, while still ensuring the script on the SH always knows the corresponding credential.
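A minimal sketch of that direct-to-UF call, assuming the UF's management port (8089) is reachable from the SH — the script path and host are placeholders, and the per-UF credential problem described above still applies:

```python
# Sketch: enable a scripted input over a UF's REST API (management port 8089).
# The hostname and script path are placeholder assumptions; your network
# must permit SH -> UF connections on 8089 for this to work at all.
from urllib.parse import quote

def enable_input_url(uf_host, script_path):
    """Build the URL that enables a scripted input on the given UF."""
    # The input's path is part of the URL and must be fully encoded.
    encoded = quote(script_path, safe="")
    return f"https://{uf_host}:8089/services/data/inputs/script/{encoded}/enable"

url = enable_input_url("endpoint01.example.com",
                       "$SPLUNK_HOME/etc/apps/my_app/bin/collect.sh")
# POST this URL (empty body) with that UF's credentials, e.g.:
#   requests.post(url, auth=("svc_account", "..."), verify=False)
```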
Having the SH directly invoke a remote execution mechanism (such as SSH or WinRM) may also be possible, but again you run into credential management.
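For completeness, the SSH variant could be as simple as the sketch below (user, host, and remote path are placeholders; key-based auth is assumed so the alert-action script never handles a password directly):

```python
# Sketch: run the collection script on an endpoint over SSH from the SH.
# User, host, and remote script path are placeholder assumptions;
# key-based auth is assumed (BatchMode makes ssh fail fast otherwise).
import subprocess

def build_ssh_command(user, host, remote_script):
    """Assemble the ssh argv list for a non-interactive remote invocation."""
    return ["ssh", "-o", "BatchMode=yes", f"{user}@{host}", remote_script]

cmd = build_ssh_command("svc_collect", "endpoint01.example.com",
                        "/opt/collect/collect.sh")
# The alert action would then run:
#   subprocess.run(cmd, timeout=60, check=True)
```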
All that said, I feel the proper solution is one where the SH alert action invokes a SOAR workflow, which can orchestrate the proper set of information gathering and responses. If you already have a system designed for managing endpoint agents and invoking scripts across the enterprise, you can integrate it as part of that workflow. This saves you from having to build all of these edge cases into your script, deferring such work to tools designed for it.
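As a sketch of what the alert action itself would do in that model — the endpoint shape follows Phantom's `/rest/playbook_run` API, but the host, IDs, and auth token are placeholders, so check your SOAR platform's documentation:

```python
# Sketch: trigger a SOAR playbook run from a Splunk alert action.
# URL shape follows Phantom's /rest/playbook_run endpoint; host, playbook
# and container IDs, and the auth token are placeholder assumptions.
import json

def playbook_run_request(soar_host, playbook_id, container_id):
    """Build the (url, json_body) pair for a playbook-run request."""
    url = f"https://{soar_host}/rest/playbook_run"
    body = json.dumps({"playbook_id": playbook_id,
                       "container_id": container_id,
                       "run": True})
    return url, body

url, body = playbook_run_request("soar.example.com", 42, 1001)
# POST with an auth header, e.g.:
#   requests.post(url, data=body,
#                 headers={"ph-auth-token": "..."}, verify=False)
```

The point is that the alert action stays tiny — one authenticated POST — and all the endpoint-reachability and credential edge cases live in the playbook, where the SOAR platform is built to handle them.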
Thank you for the detailed response.
I think that your gut instinct is correct, and we should either have an agent installed on the endpoints, or use a SOAR platform.
Thanks again