Alerting

Alert when a Splunk service is down

nawazns5038
Builder

Hi,

Suppose we have 10 heavy forwarders and want to be alerted if any one of them goes down.
How do we form an alert query?

index=_internal source=*splunkd.log* may work for a single server, but how do we extend the query to cover multiple servers?

If we use index=_internal source=*splunkd.log* | stats count by host, it won't work: a host that is down generates no events, so it won't appear in the result set at all.

1 Solution

nawazns5038
Builder

Create a lookup with all the required hostnames and use it in the below query.

index=_internal host=*hfwd*
| stats count by host
| append [| inputlookup hfwd_hosts | table host ]
| stats sum(count) as count by host
| fillnull value=0
| where count=0
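
If you don't already have the lookup, one way to bootstrap it is to capture the current host list once (a sketch, assuming all forwarders are reporting right now and reusing the hfwd_hosts lookup name from above):

index=_internal host=*hfwd* earliest=-24h
| stats count by host
| table host
| outputlookup hfwd_hosts

Any forwarder that was already down when you ran this won't be captured, so review the lookup before relying on it.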


woodcock
Esteemed Legend

If you use what you have and add one more line, you have an instant alert.

index=_internal source=*splunkd.log*
| stats dc(host) AS count values(host) AS hosts
| where count < <known_number_of_hosts>
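
With the ten heavy forwarders from the question, the complete search might look like this (the threshold of 10 and the *hfwd* host filter are illustrative; set the alert to trigger when the search returns results):

index=_internal source=*splunkd.log* host=*hfwd*
| stats dc(host) AS count values(host) AS hosts
| where count < 10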

somesoni2
Revered Legend

This needs one more stats, or just dc(host) on the existing one. Right now the count gives the count of events per host.


woodcock
Esteemed Legend

Correct, I really messed that up the first time. Corrected now.


Yorokobi
SplunkTrust

Add your heavy forwarders as search peers to your Monitoring Console and enable the "DMC Alert - Search Peer Not Responding" alert.

nawazns5038
Builder

It checks only the indexers, and even then only over the management port (8089).


xpac
SplunkTrust

@Yorokobi is right - if you add the HFs as search peers on your Monitoring Console, the MC will contact them via port 8089 and you can use its built-in alert to get a notification when one of them goes down. This actually works for all Splunk instances, be they indexers, search heads, HFs...
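
If you want to check the same thing by hand, here is a rough sketch using the REST endpoint for distributed search peers (assuming the HFs have already been added as search peers; verify the status values in your environment):

| rest /services/search/distributed/peers
| table title status
| where status!="Up"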


pradeepkumarg
Influencer

You can search the metadata and alert if forwarders have not reported for longer than a certain threshold:

| metadata type=hosts | eval age = now() - lastTime | search age > 300 
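
For a slightly friendlier result, a sketch that restricts the command to _internal (where every Splunk instance writes its own logs) and adds a human-readable timestamp:

| metadata type=hosts index=_internal
| eval age = now() - lastTime, lastSeen = strftime(lastTime, "%F %T")
| where age > 300
| table host lastSeen age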


nawazns5038
Builder

"However, in environments with large numbers of values for each category, the data might not be complete. This is intentional and allows the metadata command to operate within reasonable time and memory usage." (from the docs)

I don't think metadata can produce accurate results in that case, so I don't see it working.


Krishnagrandhi
Explorer

Also, you can narrow down the search by filtering on a specific host pattern, for example:

| metadata type=hosts | search host=testweb* | eval age = now() - lastTime | search age > 300
