We are interested in knowing whether there is a Best Practices guide for proactive and reactive monitoring of Splunk. In particular, which thresholds should we watch when using the SoS app, and what should we alert on to detect an issue with a search head, indexer, or heavy/universal forwarder?
Thanks.
This is something that we are likely to cover in the eventual S.o.S User Manual, but until then, I can offer the following recommendations:
Leverage the scripted inputs that ship with S.o.S to alert when the resource usage of Splunk processes is unreasonable. For example, the ps_sos.sh scripted input (and its Windows equivalent, ps_sos.ps1) tracks the CPU and memory usage of Splunk processes and categorizes them by process type (splunkd, Splunk Web, searches). It's fairly easy to build a search that sends an alert if any splunkd process exceeds 3 GB of physical memory usage.
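To make the threshold idea concrete, here is a minimal sketch of the check itself, written in Python rather than as a Splunk search. It is not taken from S.o.S: the function name, the row keys ("host", "pid", "command", "rss_kb") and the sample rows are all hypothetical, and the real field names produced by ps_sos.sh may differ, so check them in your own data before building the alert.

```python
from typing import Dict, Iterable, List

# 3 GB expressed in kilobytes (the threshold from the example above).
RSS_LIMIT_KB = 3 * 1024 * 1024


def splunkd_over_limit(
    rows: Iterable[Dict[str, str]], limit_kb: int = RSS_LIMIT_KB
) -> List[Dict[str, str]]:
    """Return rows for splunkd processes whose resident memory exceeds limit_kb.

    Each row is a dict with hypothetical keys: "host", "pid", "command", "rss_kb".
    """
    offenders = []
    for row in rows:
        # Only splunkd processes are of interest for this alert.
        if row["command"] != "splunkd":
            continue
        # Strictly greater than the limit counts as a violation.
        if int(row["rss_kb"]) > limit_kb:
            offenders.append(row)
    return offenders


# Made-up sample rows, for illustration only.
sample = [
    {"host": "idx01", "pid": "4121", "command": "splunkd", "rss_kb": "3500000"},
    {"host": "idx01", "pid": "4377", "command": "splunkd", "rss_kb": "812000"},
    {"host": "sh01", "pid": "2210", "command": "python", "rss_kb": "4000000"},
]

for row in splunkd_over_limit(sample):
    print(f"ALERT host={row['host']} pid={row['pid']} rss_kb={row['rss_kb']}")
```

In a real deployment the same comparison would live in a scheduled search over the data collected by the scripted input, with the alert condition set to fire when the search returns any results.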
If needed, you can draw inspiration from the searches that power the S.o.S views. Their search strings should be easily accessible, either by clicking the "view results" link of the corresponding panel or by consulting the in-app help that expands when you click the "Learn More" button.