This application provides an ".spl" file to install, which is fine for a single-server Splunk deployment.
However, we run a clustered Splunk environment (as is typical at medium-to-large companies), with separate search heads, a cluster master, heavy forwarders, indexers, etc.
Can you provide detailed installation instructions and/or recommendations for this?
- What do you install on indexers? ...
- What do you install on search heads? ...
- What do you install on heavy forwarders? ...
- If we have many heavy forwarders (for HA), can we install it on all of them, or would we get duplicate content?
The answer depends a lot on whether there are index-time transformations, search-time transformations, etc., and on how the application manages its polling of Azure.
In large environments you usually have many heavy forwarders, managed by a deployment server. If you want this app to be HA (i.e., to survive one HF going down), it should be installed on more than one, but I am not sure how it will behave in that clustered configuration with several instances pulling from the same place.
Normally, a TA is deployed to all indexers and heavy forwarders, as well as to the search heads... but since this TA contains a lot of Python, I am not sure which parts to keep in the copy deployed to each tier, as it is not clear what each one needs.
Finally... does it correctly use the CIM data model, so that it is compatible with the Splunk Enterprise Security app?
General instructions for installing apps in distributed systems are at https://docs.splunk.com/Documentation/AddOns/released/Overview/Distributedinstall
According to the app's page on Splunkbase, it is compatible with CIM version 4.x.
Let's see... an API-call input like this works like a scripted input, and if it does not store a checkpoint value of some kind, you will end up with duplicate data in the indexers. For example, if you put the input on the indexers (assuming you have many of them and deploy from the master), or on the heavy forwarders (assuming you have many of them deployed from a deployment server and, wanting HA, you deploy to all of them), every instance will pull the same data and you will get duplicates.
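To make the duplication concern concrete, here is a minimal sketch of the kind of checkpointing a polling input needs. Everything here is hypothetical (the class and function names are mine, not from the add-on): the point is that without a persisted "last seen" marker, every poll, and every extra instance, re-emits the same events.

```python
import json
import os

class Checkpoint:
    """Hypothetical checkpoint store: persists the last-forwarded marker to disk."""

    def __init__(self, path):
        self.path = path

    def load(self):
        # Return the last saved marker, or None on the first run.
        if os.path.exists(self.path):
            with open(self.path) as f:
                return json.load(f).get("last_seen")
        return None

    def save(self, marker):
        with open(self.path, "w") as f:
            json.dump({"last_seen": marker}, f)

def fetch_new_events(events, checkpoint):
    """Emit only events newer than the stored marker.

    Without this filter, every poll cycle (or every HF running the same
    input) would re-emit the full result set and create duplicates.
    """
    last = checkpoint.load()
    new = [e for e in events if last is None or e["ts"] > last]
    if new:
        checkpoint.save(new[-1]["ts"])
    return new
```

Note that a file-based checkpoint only deduplicates on the host that owns the file; two HFs each keep their own checkpoint, which is exactly why running the same input on several HFs still duplicates data.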
So... I was hoping for more of a solid implementation manual for production clustered Splunk installations at large companies.
But then I saw "Beta Test" in the .conf files... I assume this is provided on a "best effort" basis. In any case, my question was aimed more at Microsoft itself, to see whether they have tried this in a clustered environment and how they have handled these kinds of issues.
Since they use a lot of Python scripts and the like, they may already have faced these problems and have a "how to" or "best practices" guide for them.
Thanks anyway for the manual reference; I have seen it many times before. I was looking for a more detailed reply from the Microsoft side on how their app is supposed to work, not from the Splunk side.
Generally speaking, inputs should be enabled in one place. That place should not be an indexer. Put them on your HFs, but disable them on all but one. Part of your DR procedure will be to enable the inputs on the surviving HF.
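That active/standby pattern can be sketched in `inputs.conf` (the stanza name below is a made-up placeholder, not the add-on's actual input name; substitute the real stanza from the TA):

```ini
# Deployed to the one active HF (e.g. via a dedicated
# deployment-server serverclass for the "active" role):
[my_azure_input://tenant1]
disabled = 0

# Deployed to all standby HFs (a second serverclass):
[my_azure_input://tenant1]
disabled = 1
```

Failover is then a deployment-server change: move the surviving HF into the "active" serverclass so it picks up `disabled = 0`, rather than editing files on the box by hand.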