Disk Rate of Change Monitor

ericca
New Member

I'd like to detect when the disk usage rate of change exceeds 20% within a 1-hour period. The search below works for a single host but fails when searching all hosts.

Search:
index="os" sourcetype="df" host="*"
| multikv fields FileSystem, UsePct
| strcat host '@' Filesystem Host_FileSystem
| eval UsedPct=rtrim(UsePct,"%")
| convert num(usedPct)
| stats first(UsedPct) as previous, last(UsedPct) as current
| eval rateofchange=current/previous
| rename rateofchange as "% Rate of Change"
| where current/previous > 20

lguinn2
Legend

I think it will work if you just add "by Host_FileSystem" to your stats command. Without a by clause, stats computes a single previous/current pair across all hosts and filesystems. I would also use earliest and latest instead of first and last, because first and last follow the order in which events are returned rather than their timestamps. And I think that you may need to adjust your calculation.
You are already expressing the disk usage as a percentage, so what does "% Rate of Change" really mean? If you were using 30% of the filesystem and are using 40% one hour later, the rate of change is 10 percentage points per hour. Or you could compute the rate of change relative to prior usage, which would be 10/30*100 = 33%. Dividing current by previous gets you a ratio, not a percentage.
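The three readings of the 30% to 40% example can be compared side by side. This is only a small Python sketch of the arithmetic, not Splunk code, and the function names are made up for illustration:

```python
def point_change(previous, current):
    """Absolute change, in percentage points of disk usage."""
    return current - previous


def relative_change(previous, current):
    """Change relative to prior usage, expressed as a percentage."""
    return (current - previous) / previous * 100


def ratio(previous, current):
    """Plain ratio of current to previous usage (not a percentage)."""
    return current / previous


previous, current = 30.0, 40.0
print(point_change(previous, current))               # 10.0
print(round(relative_change(previous, current), 2))  # 33.33
print(round(ratio(previous, current), 2))            # 1.33
```

A filter such as "ratio > 20" would only match a filesystem whose usage grew twentyfold, which is probably not what a 20% threshold is meant to express.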

index="os" sourcetype="df" host="*"
| multikv fields Filesystem, UsePct
| strcat host "@" Filesystem Host_FileSystem
| eval UsedPct=rtrim(UsePct,"%")
| convert num(UsedPct)
| stats earliest(UsedPct) as previous, latest(UsedPct) as current by Host_FileSystem
| eval rateofchange=round(((current-previous)/previous)*100,2)
| where rateofchange > 20
| rename rateofchange as "% Rate of Change"
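To see why the by clause matters, here is a rough Python sketch of the same idea: group the samples by host@filesystem, take the earliest and latest reading in each group, and compute the change relative to prior usage as a percentage. The sample data, host names, and function name are hypothetical, and this only mirrors the logic rather than how Splunk implements stats:

```python
from collections import defaultdict

# Hypothetical samples: (epoch_seconds, host, filesystem, used_pct)
samples = [
    (1000, "web01", "/dev/sda1", 30.0),
    (1000, "db01", "/dev/sda1", 80.0),
    (4600, "web01", "/dev/sda1", 40.0),
    (4600, "db01", "/dev/sda1", 82.0),
]


def rate_of_change_by_key(rows):
    """Percent change from earliest to latest reading, per host@filesystem."""
    groups = defaultdict(list)
    for ts, host, fs, used in rows:
        groups[f"{host}@{fs}"].append((ts, used))

    result = {}
    for key, points in groups.items():
        points.sort()  # order each group's readings by timestamp
        previous, current = points[0][1], points[-1][1]
        if previous == 0:
            continue  # skip groups where the change would divide by zero
        result[key] = round((current - previous) / previous * 100, 2)
    return result


rates = rate_of_change_by_key(samples)
alerts = {key: pct for key, pct in rates.items() if pct > 20}
print(rates)   # {'web01@/dev/sda1': 33.33, 'db01@/dev/sda1': 2.5}
print(alerts)  # {'web01@/dev/sda1': 33.33}
```

Without the grouping step, the "previous" value could come from one host and the "current" value from another, which is consistent with the original search working for a single host only.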

ericca
New Member

Your suggestion worked! Here is the search I ended up with, which returns the rate of change as a percentage.

index="os" sourcetype="df" host="*" earliest=-2h@h latest=@h
| multikv fields Filesystem, UsePct
| strcat host "@" Filesystem Host_FileSystem
| eval UsedPct=rtrim(UsePct,"%")
| convert num(UsedPct)
| stats earliest(UsedPct) as previous, latest(UsedPct) as current by Host_FileSystem
| eval rateofchange=round(((current-previous)/previous)*100,2)
| where rateofchange > 1
| table Host_FileSystem previous current rateofchange
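For anyone unsure what window earliest=-2h@h latest=@h selects, the sketch below works it out in Python. It assumes the usual reading of the time modifiers (apply the offset first, then snap down to the start of the hour). The function names and the example timestamp are hypothetical:

```python
from datetime import datetime, timedelta


def snap_to_hour(t):
    """Round a timestamp down to the start of its hour (the '@h' part)."""
    return t.replace(minute=0, second=0, microsecond=0)


def search_window(now):
    """Window selected by earliest=-2h@h latest=@h, under the stated assumption."""
    earliest = snap_to_hour(now - timedelta(hours=2))  # -2h@h
    latest = snap_to_hour(now)                         # @h
    return earliest, latest


now = datetime(2024, 1, 15, 10, 37, 12)  # hypothetical "current" time
earliest, latest = search_window(now)
print(earliest)  # 2024-01-15 08:00:00
print(latest)    # 2024-01-15 10:00:00
```

If that reading is right, the search covers two whole hours, so the threshold compares usage across a two-hour span rather than the one-hour period mentioned in the original question.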
