How to add OPTIONS to the Splunk_TA_nix scripts?

thebeno

I want to focus your attention on the method of collecting CPU utilization data in Splunk_TA_nix (cpu_metric.sh).

I have been dealing with many false positive alerts regarding CPU usage in our organization.

We have ITSI implemented and use Splunk_TA_nix to collect data.

An alert is generated when two consecutive CPU usage values exceed 90%.

We collect values every 5 minutes.

The script that collects this data (Splunk_TA_nix/bin/cpu_metric.sh) uses the command sar -P ALL 1 1.

This command reports the CPU load measured over a single 1-second interval.

With our setup (collection every 5 minutes), we therefore only have information about 1 second out of every five minutes.

Based on this data we evaluate CPU usage.

Normally, CPU usage fluctuates depending on when commands are started, how long they run, and how demanding they are.

With this method of measurement, it happens quite often that 2 values cross the threshold in a row. Based on this, an alert is subsequently generated.

For monitoring, however, it is important to know the average CPU utilization and not random peaks.

If average values were collected instead, such false positive alerts would not occur (as long as the CPU is not actually overloaded).
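To illustrate the difference with made-up numbers, here is a minimal sketch. It assumes a handful of invented per-second readings in which a short burst happens to coincide with the 1-second measurement; the helper name mean_of and all values are hypothetical, not taken from a real host.

```shell
#!/bin/sh
# Invented per-second CPU utilization readings (percent) for one window.
# The third reading is a short burst; all values are made up for illustration.
samples="12 15 95 14 11 13 10 12 14 14"

# mean_of (hypothetical helper): print the mean of its numeric arguments.
mean_of() {
    printf '%s\n' "$@" | awk '{ sum += $1; n++ } END { if (n) printf "%.1f\n", sum / n }'
}

# Pretend the 1-second measurement happened to land on the third reading.
single=$(set -- $samples; echo "$3")
# Word splitting of $samples is intentional here.
average=$(mean_of $samples)

echo "single 1-second sample: ${single}%"
echo "average over window:    ${average}%"
```

With these numbers the single sample would cross a 90% threshold, while the average over the same window stays far below it.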

The standard way good administrators check CPU usage is, for example, sar 120 1, which gives the average CPU usage over 2 minutes. Collecting sar data via cron was once recommended to be set up like this:
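As a rough sketch of how such an averaged value could be extracted, the helper below reads sar CPU report text and prints 100 minus %idle from the summary line. It assumes the summary line starts with "Average:" (as sar prints under LC_ALL=C) and that %idle is the last column; the function name cpu_busy_from_sar is hypothetical, and the sample text is hand-written in the shape of sar output with invented values.

```shell
#!/bin/sh
# cpu_busy_from_sar (hypothetical helper): read sar CPU report text on stdin
# and print the average utilization (100 - %idle) from the "Average:" line
# for all CPUs. Assumes %idle is the last column of that line.
cpu_busy_from_sar() {
    awk '$1 == "Average:" && $2 == "all" { printf "%.2f\n", 100 - $NF }'
}

# On a host with sysstat installed, the intended use would be:
#   LC_ALL=C sar 120 1 | cpu_busy_from_sar

# Hand-written sample in the shape of sar output (values are invented).
cpu_busy_from_sar <<'EOF'
10:00:00        CPU     %user     %nice   %system   %iowait    %steal     %idle
10:02:00        all     20.50      0.00      4.25      0.25      0.00     75.00
Average:        all     20.50      0.00      4.25      0.25      0.00     75.00
EOF
```

Column layout can differ between sysstat versions and locales, so the field positions should be checked against the actual output before relying on this.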

*/10 * * * * root /usr/lib64/sa/sa1 -S XALL 600 1

This setup collected the average CPU usage over a 10-minute period, wrote this value to a sar file, and repeated this every 10 minutes.

Such a setup gives a realistic picture of how heavily the CPU is actually being used.

Splunk does not provide a reasonable way to set these values in the cpu_metric.sh script.

The only way to solve this is to copy the script and modify it to suit your own needs.

However, the connection to Splunk_TA_nix will be lost. What happens when Splunk_TA_nix is upgraded?

My preference is to enable CPU data collection by introducing the following stanza in our application (deployed via the deployment server) which is linked to Splunk_TA_nix.

[script://$SPLUNK_HOME/etc/apps/Splunk_TA_nix/bin/cpu_metric.sh]
disabled = false
index = unix_perfmon_metrics

But this method does not give us the possibility to set OPTIONS for sar.

It would be ideal if something like this could be done:

[script://./bin/my_cpu_metric.sh]
disabled = false
index = unix_perfmon_metrics

where ./bin/my_cpu_metric.sh contains:

exec $SPLUNK_HOME/etc/apps/Splunk_TA_nix/bin/cpu_metric.sh 120 1

But this doesn't work.

It would only be necessary for cpu_metric.sh to accept some input parameters and adjust its sar command accordingly.

The same can also be applied to other scripts in this TA.

If you have had similar experiences, feel free to share them. If my concerns are justified, it would be good if this TA were updated to give administrators the opportunity to set better metrics collection parameters.

What do you think?

PickleRick

This is a scripted input, so it doesn't have all the mechanics associated with modular inputs: you cannot pass parameters to it by setting config items in the input's config stanza. But it works on a UF, whereas modular inputs don't.

Anyway, the scripts for ta_nix are more like examples to tune and adjust to your needs than ready-for-production.

thebeno

Hi Rick,

Thanks for the reply.
Many customers use this app as a finished product from Splunk.
We would like to make such customization as easy as possible without breaking the connection between Splunk_TA_nix and our custom app. You can see my example in the original post.

The only thing needed for this to work is a small change in the scripts.
Here is a quick and dirty example of how the script could be improved:

diff cpu_metric.sh cpu_metric_new.sh
4a5,11
> # OPTIONS
> if [ "$#" -eq 1 ]; then
>     OPTIONS="$1"
> else
>     OPTIONS="-P ALL 1 1"
> fi
>
24c31,32
< CMD='sar -P ALL 1 1'
---
> #CMD='sar -P ALL 1 1'
> CMD="sar $OPTIONS"
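The patch above takes the sar options as one quoted argument, so a wrapper call such as cpu_metric.sh 120 1 (two arguments) would still fall back to the default. A possible variation, sketched here as a standalone helper rather than as the actual TA code, accepts any number of arguments; the name build_sar_cmd is hypothetical.

```shell
#!/bin/sh
# Default sar arguments, matching the stock CMD shown in the diff.
DEFAULT_SAR_OPTIONS="-P ALL 1 1"

# build_sar_cmd (hypothetical helper): print the sar command line to run.
# Any arguments given replace the default options; with no arguments the
# stock behaviour is kept.
build_sar_cmd() {
    if [ "$#" -gt 0 ]; then
        echo "sar $*"
    else
        echo "sar $DEFAULT_SAR_OPTIONS"
    fi
}

build_sar_cmd                   # no arguments: default options
build_sar_cmd 120 1             # options given as separate arguments
build_sar_cmd "-P ALL 120 1"    # options given as one quoted argument
```

Keeping the default when no arguments are passed means existing inputs.conf stanzas that call the script without options would behave as before.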

I hope I am not the only one who will appreciate this.


