Disk Rate of Change Monitor

ericca
New Member

I'd like to detect when the disk rate of change exceeds 20% within a 1-hour period. The search below works for a single host but fails when searching all hosts.

Search:
index="os" sourcetype="df" host="*" | multikv fields FileSystem, UsePct | strcat host '@' Filesystem Host_FileSystem |eval UsedPct=rtrim(UsePct,"%")| convert num(usedPct)| stats first(UsedPct) as previous, last(UsedPct) as current | eval rateofchange=current/previous | rename rateofchange as "% Rate of Change" | where current/previous > 20

1 Solution

lguinn2
Legend

I think it will work if you just add "by Host_FileSystem" to your stats command. I would also use earliest and latest instead of first and last, because first and last follow the order in which events are returned rather than their timestamps. And I think that you may need to adjust your calculation.
You are expressing the disk usage in percentage already. What does "% Rate of Change" really mean? If you were using 30% of the filesystem and are now using 40% of the file system one hour later, the rate of change is 10% per hour. Or you could compute the "rate of change" based on prior usage, so it would be 10/30*100 = 33%. Dividing current by previous gets you a ratio, but not a percentage...
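
To make the three interpretations concrete, here is a small sketch that can be pasted into the search bar. It uses makeresults to generate a single dummy event, so it needs no indexed data. The field names point_change, relative_change, and ratio are made up for illustration, and 30 and 40 are just the example values from above.

```spl
| makeresults
| eval previous=30, current=40
| eval point_change=current-previous
| eval relative_change=round((current-previous)/previous*100,2)
| eval ratio=round(current/previous,2)
| table previous current point_change relative_change ratio
```

This should show point_change=10 (percentage points per hour), relative_change=33.33 (percent growth relative to prior usage), and ratio=1.33. A threshold such as "> 20" only makes sense against one of the first two, not against the ratio.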

index="os" sourcetype="df" host="*"
| multikv fields Filesystem, UsePct
| strcat host "@" Filesystem Host_FileSystem
| eval UsedPct=rtrim(UsePct,"%")
| convert num(UsedPct)
| stats earliest(UsedPct) as previous, latest(UsedPct) as current by Host_FileSystem
| eval rateofchange=round((current-previous)/previous*100,2)
| where rateofchange > 20
| rename rateofchange as "% Rate of Change"


ericca
New Member

Your suggestion worked! I modified the stats command per your suggestion and changed the eval to return a percentage.

index="os" sourcetype="df" host="*" earliest=-2h@h latest=@h
| multikv fields Filesystem, UsePct
| strcat host "@" Filesystem Host_FileSystem
| eval UsedPct=rtrim(UsePct,"%")
| convert num(UsedPct)
| stats earliest(UsedPct) as previous, latest(UsedPct) as current by Host_FileSystem
| eval rateofchange=round(((current-previous)/previous)*100,2)
| where rateofchange > 1
| table Host_FileSystem previous current rateofchange
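
For anyone who wants the result to match the original requirement (more than 20% growth within one hour) and to keep host as its own column, a variant might look like the sketch below. It assumes the df events expose Filesystem and UsePct as in this thread, and that a rolling earliest=-1h window is acceptable. Neither has been checked against a particular add-on version, so treat it as a starting point rather than a tested alert.

```spl
index="os" sourcetype="df" earliest=-1h latest=now
| multikv fields Filesystem, UsePct
| eval UsedPct=tonumber(rtrim(UsePct,"%"))
| stats earliest(UsedPct) as previous, latest(UsedPct) as current by host, Filesystem
| where previous > 0
| eval rateofchange=round((current-previous)/previous*100,2)
| where rateofchange > 20
| table host Filesystem previous current rateofchange
```

Grouping by host and Filesystem directly replaces the strcat step, and stats only keeps the fields named in it and in the by clause, which is why _time is not in the final table. The "where previous > 0" line explicitly drops filesystems that started the window at 0% usage, where a relative change is undefined.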
