Monitoring Splunk

Disk Rate of Change Monitor

ericca
New Member

I'd like to detect when disk rate of change exceeds 20% within a 1 hour period and the search below works for a single host and fails when searching all hosts.

Search:
index="os" sourcetype="df" host="*" | multikv fields FileSystem, UsePct | strcat host '@' Filesystem Host_FileSystem |eval UsedPct=rtrim(UsePct,"%")| convert num(usedPct)| stats first(UsedPct) as previous, last(UsedPct) as current | eval rateofchange=current/previous | rename rateofchange as "% Rate of Change" | where current/previous > 20

Tags (2)
0 Karma
1 Solution

lguinn2
Legend

I think it will work if you just add "by Host_FileSystem " to your stats command. I would also use earliest and latest instead of first and last. And I think that you may need need to adjust your calculation.
You are expressing the disk usage in percentage already. What does "% Rate of Change" really mean? If you were using 30% of the filesystem and are now using 40% of the file system one hour later, the rate of change is 10% per hour. Or you could compute the "rate of change" based on prior usage, so it would be 10/30*100 = 33%. Dividing current by previous gets you a ratio, but not a percentage...

index="os" sourcetype="df" host="*" 
| multikv fields FileSystem, UsePct  
| strcat host '@' Filesystem Host_FileSystem 
| eval UsedPct=rtrim(UsePct,"%")
| convert num(usedPct)
| stats earliest(UsedPct) as previous, latest(UsedPct) as current by Host_FileSystem 
| eval rateofchange=round((current-previous)/previous,2) 
| where rateofchange > 20
| rename rateofchange as "% Rate of Change" 

View solution in original post

0 Karma

lguinn2
Legend

I think it will work if you just add "by Host_FileSystem " to your stats command. I would also use earliest and latest instead of first and last. And I think that you may need need to adjust your calculation.
You are expressing the disk usage in percentage already. What does "% Rate of Change" really mean? If you were using 30% of the filesystem and are now using 40% of the file system one hour later, the rate of change is 10% per hour. Or you could compute the "rate of change" based on prior usage, so it would be 10/30*100 = 33%. Dividing current by previous gets you a ratio, but not a percentage...

index="os" sourcetype="df" host="*" 
| multikv fields FileSystem, UsePct  
| strcat host '@' Filesystem Host_FileSystem 
| eval UsedPct=rtrim(UsePct,"%")
| convert num(usedPct)
| stats earliest(UsedPct) as previous, latest(UsedPct) as current by Host_FileSystem 
| eval rateofchange=round((current-previous)/previous,2) 
| where rateofchange > 20
| rename rateofchange as "% Rate of Change" 
0 Karma

ericca
New Member

Your suggestion worked !!! I modified the stats command per your suggestion to return a percentage.

index="os" sourcetype="df" host="*" earliest=-2h@h latest=@h
| multikv fields FileSystem, UsePct

| strcat host '@' Filesystem Host_FileSystem
| eval UsedPct=rtrim(UsePct,"%")
| convert num(usedPct)
| stats earliest(UsedPct) as previous, latest(UsedPct) as current by Host_FileSystem
| eval rateofchange=round(((current-previous)/previous)*100,2)
| where rateofchange > 1
| table _time host Host_FileSystem previous current rateofchange

0 Karma
Get Updates on the Splunk Community!

Easily Improve Agent Saturation with the Splunk Add-on for OpenTelemetry Collector

Agent Saturation What and Whys In application performance monitoring, saturation is defined as the total load ...

Explore the Latest Educational Offerings from Splunk [January 2025 Updates]

At Splunk Education, we are committed to providing a robust learning experience for all users, regardless of ...

Developer Spotlight with Paul Stout

Welcome to our very first developer spotlight release series where we'll feature some awesome Splunk ...