Alert when a Splunk service is down

Builder

Hi,

Suppose we have 10 heavy forwarders and want to get alerted if any one of them goes down.
How do we form an alert query?

index=_internal source=*splunkd.log* may work for a single server. How do we extend the query to work for multiple servers?

If we use
index=_internal source=*splunkd.log* | stats count by host
it may not work, because a host that is down won't be included in the result set.

0 Karma
1 Solution

Builder

Create a lookup with all the required hostnames and use it in the query below. Any host in the lookup that has sent no events ends up with a count of 0.

index=_internal host=*hfwd* | stats count by host
| append [ | inputlookup hfwd_hosts | table host ]
| stats sum(count) as count by host | fillnull value=0 | where count=0
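
One way to build that lookup, as a rough sketch: the host names hfwd01 to hfwd03 and the file name hfwd_hosts.csv are placeholders, not taken from this thread. The search writes a one-column CSV lookup with a host field.

```
| makeresults
| eval host="hfwd01,hfwd02,hfwd03"
| makemv delim="," host
| mvexpand host
| table host
| outputlookup hfwd_hosts.csv
```

It can then be read with | inputlookup hfwd_hosts.csv, or without the extension if a lookup definition of that name exists. Running the alert over a short time range (for example the last 15 minutes) is what makes a recently stopped forwarder show up with count=0.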

0 Karma

Esteemed Legend

If you use what you have and add one more line, you have an instant alert.

index=_internal source=*splunkd.log*
| stats dc(host) AS count values(host)
| where count < <known_number_of_hosts>

0 Karma

SplunkTrust

This needs one more stats, or just dc(host) on the existing one. Right now the count gives the count of events per host.

0 Karma

Esteemed Legend

Correct, I really messed that up the first time. Corrected now.

0 Karma

SplunkTrust

Add your heavy forwarders as search peers to your Monitoring Console and enable the "DMC Alert - Search Peer Not Responding" alert.

Builder

It checks only the indexers, and even then only the management port (8089).

0 Karma

SplunkTrust

@Yorokobi is right: if you add the HFs as search peers on your Monitoring Console, the MC will contact them via port 8089 and you can use its built-in alert to get a notification when one of them goes down. This actually works for all Splunk instances, be they indexers, search heads, HFs...

0 Karma

Influencer

You can search metadata and alert if forwarders have not reported for longer than a certain threshold (here, 300 seconds):

| metadata type=hosts | eval age = now() - lastTime | search age > 300 
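
A slightly expanded sketch of the same idea, with some assumptions: the index=_internal argument and the *hfwd* host pattern are borrowed from the other answers in this thread (without an index argument, metadata looks at the default index), and lastSeen is just an illustrative field name for a readable timestamp.

```
| metadata type=hosts index=_internal
| search host=*hfwd*
| eval age = now() - lastTime
| where age > 300
| eval lastSeen = strftime(lastTime, "%Y-%m-%d %H:%M:%S")
| table host lastSeen age
```

Each returned row is a host whose most recent event is older than 300 seconds, so the alert condition can simply be "number of results greater than zero".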

0 Karma

Builder

From the docs: "However, in environments with large numbers of values for each category, the data might not be complete. This is intentional and allows the metadata command to operate within reasonable time and memory usage."

So I don't think metadata can produce accurate results; I don't see it working.

0 Karma

Explorer

You can also narrow the search down to specific hosts:

| metadata type=hosts | search host= | eval age = now() - lastTime | search age > 300

OR

| metadata type=hosts | search host=testweb* | eval age = now() - lastTime | search age > 300

0 Karma