Alerting

Alert for Finding Systems that Stopped Reporting

uofrmike
New Member

I created the following alert for finding systems that have recently stopped reporting. I haven't seen a similar solution to this problem, so I thought I would post it here in the hope that it helps others.

This alert finds hosts (listed per index / host / sourcetype combination) that sent no data during the most recent window the search checks (earliest=-2d@d latest=-1d@d) but did send data in the days before it. The baseline is set to the 3 days prior (earliest=-5d@d latest=-2d@d), but it can be changed to a longer duration. Because of that 3-day baseline, each host is reported for 3 consecutive days, which gives the admins time to get the system back online.

I suggest scheduling the alert to run daily.

index=_internal source=*license_usage.log type=Usage earliest=-5d@d latest=-2d@d
| eval HostSource=idx . " / " . h . " / " . st
| fields HostSource
| dedup HostSource
| eval PAST="YES"
| join type=outer HostSource
    [ search index=_internal source=*license_usage.log type=Usage earliest=-2d@d latest=-1d@d
      | eval HostSource=idx . " / " . h . " / " . st
      | fields HostSource
      | dedup HostSource
      | eval PRESENT="YES" ]
| where isnull(PRESENT)
| table HostSource
| sort HostSource
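
To illustrate the "longer duration" adjustment, here is a sketch of the same search with a 7-day baseline instead of 3. The 7-day figure is only an assumed example; the one value that changes is the earliest time of the outer search (-9d@d instead of -5d@d). I have also left out the PAST field, since nothing downstream reads it.

```
index=_internal source=*license_usage.log type=Usage earliest=-9d@d latest=-2d@d
| eval HostSource=idx . " / " . h . " / " . st
| fields HostSource
| dedup HostSource
| join type=outer HostSource
    [ search index=_internal source=*license_usage.log type=Usage earliest=-2d@d latest=-1d@d
      | eval HostSource=idx . " / " . h . " / " . st
      | fields HostSource
      | dedup HostSource
      | eval PRESENT="YES" ]
| where isnull(PRESENT)
| table HostSource
| sort HostSource
```

Note that with a 7-day baseline, a host that stops reporting would show up in the daily results for 7 days rather than 3.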

gcusello
SplunkTrust

Hi @uofrmike,

Using join on an index with as many events as _internal will surely make for a very slow search.

Please try a different approach:

index=_internal earliest=-3d@d latest=now
| eval day=if(now()-_time<86400,"Last day","Previous days")
| stats dc(day) AS dc_day values(day) AS day count BY host
| where dc_day=1 AND day="Previous days"
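
The same idea can probably be applied to the license_usage.log data from the original search, so that the results stay at the index / host / sourcetype level. This is only a minimal sketch, assuming the same time windows as the original alert (3 baseline days, then the day from -2d@d to -1d@d). The field names period and dc_period are made up for illustration, and I have not tested it against real data:

```
index=_internal source=*license_usage.log type=Usage earliest=-5d@d latest=-1d@d
| eval HostSource=idx . " / " . h . " / " . st
| eval period=if(_time>=relative_time(now(),"-2d@d"),"PRESENT","PAST")
| stats dc(period) AS dc_period values(period) AS period BY HostSource
| where dc_period=1 AND period="PAST"
| table HostSource
| sort HostSource
```

Each HostSource is tagged as PAST or PRESENT depending on where its events fall, and only those seen exclusively in the PAST window are kept. Everything happens in a single pass over the data, with no subsearch.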

Ciao.

Giuseppe

 
