
Monitoring Processes using metrics and SAI

Skins
Path Finder

Is it possible to monitor whether processes are running using metrics data and SAI?

I want to push out a config via the UF that says "monitor these X processes" and alerts if any of them stop.

Thanks.

Also, I have a single UF reporting to Splunk, and my SAI "Overview" dashboard looks like this (last 15 minutes):

UPTIME(h:m:s)
top 10547   root    0   0.2 00:00:00
top 10817   root    0   0.2 00:00:00
top 11789   root    0   0.2 00:00:00
top 13036   root    0   0.2 00:00:00
top 14295   root    0   0.2 00:00:00
top 17779   root    0   0.1 4+18:58:48

What is this telling me? I only have one instance of top running, which shows an uptime of more than 4 days. Where are all the other PIDs coming from?

ps -ef | grep -i top
root     17779 24995  0 Feb21 pts/1    00:03:02 top
root     30528 26798  0 12:27 pts/0    00:00:00 grep --color=auto -i top
#

dagarwal_splunk
Splunk Employee

The Overview dashboard for processes only looks at the last 5 minutes for all metrics.

You can also use the "Analysis" page to see graphs of the process metrics. There you can split by dimensions such as pid and process_name, and set alerts as well.

Right now, you cannot set an alert in SAI for a process that has stopped or is not running. Alerts won't fire when data stops coming in for a particular process.
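
Since the alert cannot fire on missing data, one possible workaround outside SAI is to compare the last time each expected process reported against the current time. The following is only a minimal sketch of that idea, not an SAI feature: the function name `missing_processes`, the `last_seen` mapping, and the 300-second threshold are all hypothetical, and it assumes you have already retrieved the latest sample timestamp per process name by some other means.

```python
from typing import Dict, Iterable, List


def missing_processes(
    expected: Iterable[str],
    last_seen: Dict[str, float],
    now: float,
    max_age_s: float = 300.0,  # hypothetical threshold, not an SAI setting
) -> List[str]:
    """Return expected process names with no sample newer than max_age_s."""
    missing = []
    for name in expected:
        seen = last_seen.get(name)
        # A process counts as missing if it never reported or its
        # latest sample is older than the allowed age.
        if seen is None or now - seen > max_age_s:
            missing.append(name)
    return missing


if __name__ == "__main__":
    now = 1_000_000.0
    # Hypothetical "latest sample" timestamps (epoch seconds) per process.
    last_seen = {"top": now - 30, "sshd": now - 900}
    print(missing_processes(["top", "sshd", "crond"], last_seen, now))
```

Here `top` reported 30 seconds ago and is treated as running, while `sshd` (stale) and `crond` (never seen) would be flagged.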


dagarwal_splunk
Splunk Employee

Are you using SAI's collectd processmon plugin?

How are you getting that process data?


dagarwal_splunk
Splunk Employee

All the other PIDs belong to top processes that you might have started for a short time in the past and then stopped.
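
To illustrate the explanation above: when samples are grouped by the pid dimension over a time window, any PID with even a single sample inside that window gets its own row, however briefly the process lived. This is a rough sketch under that assumption, not how SAI is actually implemented; the `pids_in_window` function and the sample tuples are hypothetical.

```python
from typing import Dict, Iterable, Tuple

# (timestamp in epoch seconds, process_name, pid); a hypothetical sample shape
Sample = Tuple[float, str, int]


def pids_in_window(
    samples: Iterable[Sample], now: float, window_s: float
) -> Dict[int, int]:
    """Count samples per pid that fall inside [now - window_s, now]."""
    counts: Dict[int, int] = {}
    for ts, _name, pid in samples:
        if now - window_s <= ts <= now:
            counts[pid] = counts.get(pid, 0) + 1
    return counts


if __name__ == "__main__":
    now = 10_000.0
    samples = [
        (now - 840, "top", 10547),  # short-lived: one sample only
        (now - 540, "top", 10817),  # short-lived: one sample only
        (now - 600, "top", 17779),  # long-running: reports repeatedly
        (now - 300, "top", 17779),
        (now - 0, "top", 17779),
    ]
    # With a 15-minute (900 s) window, all three pids appear.
    print(pids_in_window(samples, now, 900))
```

Shrinking the window to 60 seconds in this sketch leaves only pid 17779, which is why a shorter lookback can hide short-lived PIDs.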


Skins
Path Finder

With this app, which does not require collectd:

https://splunkbase.splunk.com/app/4856/

Why would it show processes started in the past over a 15-minute window?

Thanks.
