Disk Rate of Change Monitor

ericca
New Member

I'd like to detect when the disk rate of change exceeds 20% within a one-hour period. The search below works for a single host but fails when searching all hosts.

Search:
index="os" sourcetype="df" host="*" | multikv fields FileSystem, UsePct | strcat host '@' Filesystem Host_FileSystem |eval UsedPct=rtrim(UsePct,"%")| convert num(usedPct)| stats first(UsedPct) as previous, last(UsedPct) as current | eval rateofchange=current/previous | rename rateofchange as "% Rate of Change" | where current/previous > 20

lguinn2
Legend

I think it will work if you just add "by Host_FileSystem" to your stats command. I would also use earliest and latest instead of first and last, since they pick values by event time rather than by the order in which results come back. And I think that you may need to adjust your calculation.
You are expressing the disk usage in percentage already. What does "% Rate of Change" really mean? If you were using 30% of the filesystem and are now using 40% of the filesystem one hour later, the rate of change is 10% per hour. Or you could compute the "rate of change" based on prior usage, so it would be 10/30*100 = 33%. Dividing current by previous gets you a ratio, but not a percentage...
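
To make the three readings concrete, here is a small self-contained sketch that uses makeresults with the 30% and 40% figures from above. The field names point_change, relative_change_pct and ratio are made up for illustration and are not part of the original search.

```spl
| makeresults
| eval previous=30, current=40
| eval point_change=current-previous
| eval relative_change_pct=round((current-previous)/previous*100, 2)
| eval ratio=round(current/previous, 2)
| table previous current point_change relative_change_pct ratio
```

With these inputs, point_change is 10, relative_change_pct is 33.33 and ratio is 1.33, so a threshold of 20 only makes sense against one of the first two.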

index="os" sourcetype="df" host="*" 
| multikv fields FileSystem, UsePct  
| strcat host '@' Filesystem Host_FileSystem 
| eval UsedPct=rtrim(UsePct,"%")
| convert num(UsedPct)
| stats earliest(UsedPct) as previous, latest(UsedPct) as current by Host_FileSystem 
| eval rateofchange=round((current-previous)/previous,2) 
| where rateofchange > 20
| rename rateofchange as "% Rate of Change" 
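
For anyone wondering why the by clause matters: without it, stats collapses every host and filesystem into a single row, so previous and current can come from different filesystems. A minimal sketch with synthetic events follows. The hostA and hostB names, the values and the 10-minute spacing are invented for illustration.

```spl
| makeresults count=4
| streamstats count as n
| eval Host_FileSystem=if(n<=2, "hostA@/dev/sda1", "hostB@/dev/sda1")
| eval UsedPct=case(n==1, 30, n==2, 40, n==3, 70, n==4, 71)
| eval _time=now() - (5-n)*600
| stats earliest(UsedPct) as previous, latest(UsedPct) as current by Host_FileSystem
```

This should give one row per Host_FileSystem: 30 and 40 for hostA, 70 and 71 for hostB. Dropping the by clause would instead pair the 30 from hostA with the 71 from hostB.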

ericca
New Member

Your suggestion worked! I modified the eval command per your suggestion so that it returns a percentage.

index="os" sourcetype="df" host="*" earliest=-2h@h latest=@h
| multikv fields FileSystem, UsePct
| strcat host '@' Filesystem Host_FileSystem
| eval UsedPct=rtrim(UsePct,"%")
| convert num(UsedPct)
| stats earliest(UsedPct) as previous, latest(UsedPct) as current by Host_FileSystem
| eval rateofchange=round(((current-previous)/previous)*100,2)
| where rateofchange > 1
| table _time host Host_FileSystem previous current rateofchange
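
One thing to watch in the last line: stats keeps only the aggregated fields and the by fields, so _time and host are no longer present when table runs, and those columns will come out empty. A possible variation is sketched below. It assumes the extracted field is named Filesystem (as in the strcat line) and that splitting by host and Filesystem directly is acceptable in place of Host_FileSystem.

```spl
index="os" sourcetype="df" host="*" earliest=-2h@h latest=@h
| multikv fields Filesystem, UsePct
| eval UsedPct=tonumber(rtrim(UsePct, "%"))
| stats earliest(UsedPct) as previous, latest(UsedPct) as current, latest(_time) as _time by host, Filesystem
| eval rateofchange=round(((current-previous)/previous)*100, 2)
| where rateofchange > 1
| table _time host Filesystem previous current rateofchange
```

Here latest(_time) as _time carries the timestamp of the most recent sample through stats, and host survives because it is one of the by fields.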
