Alerting

Alert for Finding Systems that Stopped Reporting

uofrmike
New Member

I created the following alert to find systems that have recently stopped reporting. I haven't seen a similar solution to this problem, so I'm posting it here in the hope that it helps others.

This alert finds hosts that haven't sent data within the last day but did in the preceding days. It looks at the 3 days prior, which can be changed to a longer duration. Each host is reported for 3 days to give the admins time to get the system back online.

I suggest scheduling the alert to run daily.

index=_internal source=*license_usage.log type=Usage earliest=-5d@d latest=-2d@d
| eval HostSource=idx . " / " . h . " / " . st
| fields HostSource
| dedup HostSource
| eval PAST="YES"
| join type=outer HostSource [ search index=_internal source=*license_usage.log type=Usage earliest=-2d@d latest=-1d@d | eval HostSource=idx . " / " . h . " / " . st | fields HostSource | dedup HostSource | eval PRESENT="YES" ]
| where isnull(PRESENT)
| table HostSource
| sort HostSource
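
As a rough sketch of the longer duration mentioned above, assuming a 7-day lookback is wanted, only the earliest value of the outer search needs to move (from -5d@d to -9d@d). The 7-day figure is just an illustration, not a recommendation:

```spl
index=_internal source=*license_usage.log type=Usage earliest=-9d@d latest=-2d@d
| eval HostSource=idx . " / " . h . " / " . st
| fields HostSource
| dedup HostSource
| eval PAST="YES"
| join type=outer HostSource [ search index=_internal source=*license_usage.log type=Usage earliest=-2d@d latest=-1d@d | eval HostSource=idx . " / " . h . " / " . st | fields HostSource | dedup HostSource | eval PRESENT="YES" ]
| where isnull(PRESENT)
| table HostSource
| sort HostSource
```

With this window a silent host should keep appearing in the results for 7 daily runs instead of 3.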


gcusello
SplunkTrust

Hi @uofrmike,

Using join across this many events (as in _internal) will make the search very slow.

Please try a different approach:

index=_internal earliest=-3d@d latest=now
| eval day=if(now()-_time<86400,"Last day","Previous days")
| stats dc(day) AS dc_day values(day) AS day count BY host
| where dc_day=1 AND day="Previous days"
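
Applied to the license usage data from the original post, the same join-free idea might look like the sketch below. It assumes the same time windows as the original alert (a past window from -5d@d to -2d@d and a present window from -2d@d to -1d@d); the field names period and dc_period are made up for illustration:

```spl
index=_internal source=*license_usage.log type=Usage earliest=-5d@d latest=-1d@d
| eval HostSource=idx . " / " . h . " / " . st
| eval period=if(_time>=relative_time(now(), "-2d@d"), "PRESENT", "PAST")
| stats dc(period) AS dc_period values(period) AS period BY HostSource
| where dc_period=1 AND period="PAST"
| table HostSource
| sort HostSource
```

A single pass over the events labels each one as PAST or PRESENT, and only the HostSource values seen exclusively in the past window are kept, so no subsearch is needed.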

Ciao.

Giuseppe

 
