Splunk Search

How to detect hosts that are not reporting?

unitedmarsupial
Path Finder

We have a large number of hosts reporting to Splunk, and sometimes (rarely) some of them stop sending events. Is there an elegant search for hosts whose last reported event is more than T ago?

I'd like to make an alert for T being above, say, 6 hours or so...


mattymo
Splunk Employee

Can't you just talk to the humans who do have access to install apps???

Much easier than re-inventing the wheel. Also, based on the question below about why a lookup is necessary, I'd recommend you spare yourself the scars of learning 😉

Plus, once your alert goes nuts... you'll see why the app is so cool

- MattyMo

unitedmarsupial
Path Finder

This is what I ended up using -- thanks to @gcusello for the stats ... BY host idea:

a search for normal events
| fields host, _time
| stats max(_time) AS most_recent by host
| where most_recent < relative_time(now(), "-5h")
| eval most_recent = strftime(most_recent, "%F %T")

The above performs whatever search you typically use, then looks for hosts that haven't produced any matching events within the specified time (5 hours in the example above). The search time range is set by the usual time picker, which should, obviously, span more than the alert threshold.

(The relative_time call could probably be expressed more elegantly, but this works.)
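
For what it's worth, the same cutoff can also be written with plain epoch arithmetic, which some may find more readable (5*3600 is just the 5-hour threshold in seconds):

| where most_recent < now() - 5*3600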


gcusello
SplunkTrust

Hi @unitedmarsupials,
you have to create a lookup (e.g. called perimeter.csv with a field called host) containing all the hosts to monitor; then you have to run a search like this:

| metasearch index=_internal
| eval host=lower(host)
| stats count BY host
| append [ | inputlookup perimeter.csv | eval host=lower(host), count=0 | fields host count ]
| stats sum(count) AS total BY host
| where total=0
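
For reference, perimeter.csv is just a one-column CSV whose header matches the field used above; the host names here are only placeholders:

host
webserver01
dbserver02
appserver03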

In this way you have all the hosts from your list that didn't send logs in the monitoring period.
You can create an alert to run e.g. every 5 minutes.
If you delete the last row and add | eval status=if(total=0,"Missing","Up") instead, you have a dashboard that displays each host's status.
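
Putting that together, the dashboard variant would look roughly like this (the same search as above, with the final where replaced by the status eval):

| metasearch index=_internal
| eval host=lower(host)
| stats count BY host
| append [ | inputlookup perimeter.csv | eval host=lower(host), count=0 | fields host count ]
| stats sum(count) AS total BY host
| eval status=if(total=0,"Missing","Up")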

Ciao.
Giuseppe


unitedmarsupial
Path Finder

Thanks for the ideas, but why do I need to create a lookup? The hosts are already known to Splunk -- all those that have reported in the last, say, 30 days, but have not reported in the last 5 hours.
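
For reference, a lookup-free sketch of that idea is possible with tstats; index=* below is just a placeholder for whichever indexes actually matter:

| tstats latest(_time) AS most_recent WHERE index=* earliest=-30d BY host
| where most_recent < relative_time(now(), "-5h")
| eval most_recent=strftime(most_recent, "%F %T")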


gcusello
SplunkTrust

Hi @unitedmarsupials,
A manually managed lookup is the easiest way to be sure about the monitoring perimeter: if you, for example, take only the hosts seen in the last 24 hours, you won't check hosts that didn't send anything during that period!

Anyway, if this is sufficient for you, you can schedule a search every night that populates the perimeter.csv lookup, so you don't have to do anything by hand.

| metadata type=hosts index=_internal
| where lastTime > relative_time(now(), "-24h")
| dedup host
| sort host
| table host
| outputlookup perimeter.csv

and then run the above search e.g. every 5 minutes.
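
As a sketch of how the two pieces could be wired together in savedsearches.conf (the stanza names, cron expressions, and dispatch window below are only examples, not anything prescribed in this thread):

# savedsearches.conf -- example stanzas only
[Populate perimeter lookup]
search = | metadata type=hosts index=_internal | where lastTime > relative_time(now(), "-24h") | dedup host | sort host | table host | outputlookup perimeter.csv
cron_schedule = 0 1 * * *
enableSched = 1

[Missing hosts alert]
search = | metasearch index=_internal | eval host=lower(host) | stats count BY host | append [ | inputlookup perimeter.csv | eval host=lower(host), count=0 | fields host count ] | stats sum(count) AS total BY host | where total=0
cron_schedule = */5 * * * *
enableSched = 1
dispatch.earliest_time = -5m
dispatch.latest_time = now
counttype = number of events
relation = greater than
quantity = 0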

Ciao.
Giuseppe


gcusello
SplunkTrust

Hi @unitedmarsupials,
your solution surely meets your functional need, but I think it's a very slow search if you run it over _internal (which means you cannot execute it as an alert every five minutes, for example!), and an unreliable one if you use a different index (because it's possible that there's nothing to receive on that index at all!).
In addition, you don't check servers that didn't send any logs within the search timeframe.

I have used the above solution for an alert (with a frequency of 5 minutes) that has been running for many years!

Ciao and see you next time!
Giuseppe


mattymo
Splunk Employee

Please check out "TrackMe" on Splunkbase, an amazing app by @guilmxm

https://splunkbase.splunk.com/app/4621/

Great app that helps you manage and alert on data sources!

- MattyMo

unitedmarsupial
Path Finder

Thanks, but I don't have the access necessary to install new apps...


gjanders
SplunkTrust

Great app! If you want an alternative, try Broken Hosts or Meta Woot!

mattymo
Splunk Employee

yep! Honorable mention for meta woot for sure!

- MattyMo