Disk Space Monitoring in Splunk ITSI and CPU & Memory Correlation


Question: we are trying to monitor disk space usage in Splunk ITSI.

We are trying to use templates as much as possible in our environment. What I am trying to understand is how we monitor drive space when each individual server has multiple file systems.

Do we have to write multiple KPI searches for each and every server/entity if we want to identify issues such as a disk filling up because a particular log file was writing debug messages?

Do we have to write multiple KPI searches if we want to identify that CPU/memory was at 100% because a runaway process or service was consuming very high CPU/memory resources?

For these types of root cause correlation, what would be a good way of representing this visually? It looks like the deep dive does not provide this level of visibility, and it seems to me that it would require manual correlation. Is my understanding correct on this one?


I guess what I am trying to say is that I want to see, for example, which log file caused the disk usage to go high, and be able to see the log entry for it from the same view.

Similarly for CPU: how do we see which process ended up causing the CPU to spike? We see the metric-based data in the deep dive view, but it would also be nice to see the actual process, or drill into the metric somehow, and have it show why the spike happened. Is this something that can be done from the deep dive view? Or do we need to create individual manual correlations and step outside the native ITSI deep dive functionality?







Did you ever make any progress with this? One thought I have is to create a KPI per disk, but it feels bad. You run into the same problem with network interfaces as well.
