Alerting

Alert when a Splunk service is down

nawazns5038
Builder

Hi,

Suppose we have 10 heavy forwarders and want to get alerted if any one of them goes down.
How do we form an alert query?

index=_internal source=*splunkd.log* may work for a single server, but how do we extend the query to work for multiple servers?

If we use

index=_internal source=*splunkd.log* | stats count by host

it may not work: a host that is down produces no events, so it won't be included in the result set.

0 Karma
1 Solution

nawazns5038
Builder

Create a lookup containing all of the required hostnames and use it in the query below:

index=_internal host=*hfwd*
| stats count by host
| append [| inputlookup hfwd_hosts | table host ]
| stats sum(count) as count by host
| fillnull value=0
| where count=0
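The idea behind the append/inputlookup trick can be sketched outside SPL (a Python sketch with hypothetical host names and sample data): every host from the lookup is merged in, so hosts that sent no events end up with a summed count of 0 and survive the final filter.

```python
from collections import Counter

# Hosts seen in index=_internal events (hypothetical sample data)
event_hosts = ["hfwd01", "hfwd01", "hfwd02"]

# The lookup (hfwd_hosts) lists every host that *should* be reporting
lookup_hosts = ["hfwd01", "hfwd02", "hfwd03"]

# stats count by host
counts = Counter(event_hosts)

# append the lookup rows (which carry no count field),
# then stats sum(count) by host, with missing counts filled as 0
totals = {h: counts.get(h, 0) for h in set(lookup_hosts) | set(counts)}

# where count=0 -> hosts that sent no events, i.e. likely down
down = sorted(h for h, c in totals.items() if c == 0)
print(down)  # -> ['hfwd03']
```

With the sample data above, hfwd03 appears only in the lookup, so it is the one flagged as down.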


0 Karma


woodcock
Esteemed Legend

If you use what you have and add one more line, you have an instant alert.

index=_internal source=*splunkd.log*
| stats dc(host) AS count values(host)
| where count < <known_number_of_hosts>
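This approach compares the distinct-host count against a known total instead of a lookup. A minimal Python sketch of the same check (host names and the expected total are hypothetical, standing in for <known_number_of_hosts>):

```python
# Distinct-host check: alert when fewer hosts report than expected
reporting_hosts = {"hfwd01", "hfwd02"}  # hosts seen in splunkd.log events
EXPECTED = 3                            # known number of heavy forwarders

count = len(reporting_hosts)            # dc(host)
if count < EXPECTED:
    print(f"ALERT: only {count} of {EXPECTED} forwarders reporting")
```

Note that values(host) shows which hosts are still present; unlike the lookup approach, it doesn't directly name the hosts that are missing.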
0 Karma

somesoni2
Revered Legend

This needs one more stats, or just dc(host) on the existing one. Right now count gives the count of events per host.

0 Karma

woodcock
Esteemed Legend

Correct, I really messed that up the first time. Corrected now.

0 Karma

Yorokobi
SplunkTrust
SplunkTrust

Add your heavy forwarders as search peers to your Monitoring Console and enable the "DMC Alert - Search Peer Not Responding" alert.

nawazns5038
Builder

It checks only the indexers, and only over the management port (8089).

0 Karma

xpac
SplunkTrust
SplunkTrust

@Yorokobi is right - if you add the HFs as search peers on your Monitoring Console, the MC will contact them via port 8089 and you can use its built-in alert to get a notification when one of them goes down. This actually works for all Splunk instances, be they indexers, search heads, HFs...

0 Karma

pradeepkumarg
Influencer

You can search the metadata and alert if forwarders have not reported for more than a certain threshold:

| metadata type=hosts | eval age = now() - lastTime | search age > 300 
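The metadata approach flags hosts whose most recent event is older than a threshold. The same logic in a short Python sketch (hypothetical hosts and timestamps; THRESHOLD mirrors the "age > 300" filter):

```python
import time

THRESHOLD = 300  # seconds, matching "age > 300" in the SPL

now = time.time()
# lastTime per host, as | metadata type=hosts would report it
# (hypothetical sample: hfwd02 last reported 10 minutes ago)
last_time = {"hfwd01": now - 30, "hfwd02": now - 600}

# eval age = now() - lastTime | search age > 300
stale = sorted(h for h, t in last_time.items() if now - t > THRESHOLD)
print(stale)  # -> ['hfwd02']
```

Only hfwd02 exceeds the 300-second threshold in the sample, so it is the one returned.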

0 Karma

nawazns5038
Builder

"However, in environments with large numbers of values for each category, the data might not be complete. This is intentional and allows the metadata command to operate within reasonable time and memory usage." ... from the docs.

I don't think metadata can produce accurate results here, so I don't see it working.

0 Karma

Krishnagrandhi
Explorer

You can also narrow down the search to specific hosts, for example:

| metadata type=hosts | search host=testweb* | eval age = now() - lastTime | search age > 300

0 Karma