Question: we are trying to monitor disk space usage in Splunk ITSI.
We are trying to use templates as much as possible in our environment. What I am trying to understand is how we monitor drive space when each individual server has multiple file systems.
Do we have to write multi-KPI searches for each and every server/entity if we want to identify issues such as a disk filling up because a particular log file was writing debug messages?
Do we have to write multi-KPI searches if we want to identify that CPU/memory hit 100% because a runaway process or service was consuming very high CPU/memory resources?
For these types of root cause correlation, what would be a good way of representing this visually? It looks like the deep dive does not provide this level of visibility, and it seems to me that it would require manual correlation. Is my understanding correct on this one?
I guess what I am trying to say is that I want to see, for example, which log file caused the disk usage to climb, and to be able to see the relevant log entries from the same view.
Similarly for CPU: how do we see which process caused the CPU to spike? We see the metric-based data in the deep dive view, but it would also be nice to see the actual process, or to drill into the metric somehow and have it show why the spike happened. Is this something that can be done from the deep dive view? Or do we need to create individual manual correlations and jump outside of the native ITSI deep dive functionality?
Did you ever make any progress with this? One thought I have is to create a KPI per disk, but it feels bad. You run into the same problem with network interfaces as well.
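To make the alternative to "a KPI per disk" a bit more concrete, here is a rough Python sketch (not SPL and not an ITSI API) of the idea of one shared calculation that is split by a composite host:mount key, so that each file system gets its own value without its own definition. The sample records and the field names `host`, `mount`, and `used_pct` are assumptions made up for illustration, and so are the two helper functions.

```python
# Hypothetical sample records. The field names (host, mount, used_pct)
# are assumptions for illustration, not fields ITSI guarantees.
samples = [
    {"host": "web01", "mount": "/", "used_pct": 41.0},
    {"host": "web01", "mount": "/var/log", "used_pct": 93.5},
    {"host": "db01", "mount": "/", "used_pct": 55.0},
    {"host": "db01", "mount": "/data", "used_pct": 78.2},
]


def worst_filesystem_per_host(records):
    """Return {host: (mount, used_pct)} for the fullest file system on each host."""
    worst = {}
    for r in records:
        current = worst.get(r["host"])
        if current is None or r["used_pct"] > current[1]:
            worst[r["host"]] = (r["mount"], r["used_pct"])
    return worst


def over_threshold(records, threshold=90.0):
    """Return sorted 'host:mount' keys whose usage is at or above the threshold."""
    return sorted(
        f'{r["host"]}:{r["mount"]}'
        for r in records
        if r["used_pct"] >= threshold
    )


if __name__ == "__main__":
    print(worst_filesystem_per_host(samples))
    print(over_threshold(samples))
```

The same shape would presumably apply to network interfaces, with a host:interface key instead of host:mount. Whether this maps cleanly onto a single templated KPI in ITSI is something I have not verified.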