Alerting

Create alert for not all hosts having data

jpolachak
New Member

I am trying to create an alert/dashboard for our users. The idea is a search query that produces an alert if the named process is not running on the specified hosts. The query works great when the data is there. The issue is when one of the hosts has no data at all because it's down: if my search covers 4 hosts, it will now only show the 3 hosts with the named process running. I am trying to use this query:

index="os" source="top" host=ns1 OR host=ns2 OR host=ns3 OR host=ns4 named earliest=-2min latest=-1min
|chart count by host

To test this I have just been adding a bogus host to the search query and running it to see what happens. However, if it doesn't find anything for a host, the results just show the ones that are running. I am new and am sure there must be an easy way to approach this that I am unaware of. I have tried adding "| search count=0", but that doesn't work, since a host that returns no events never produces a count=0 row in the first place.

Originally I was trying to create a graph that shows a green bar if a host is up and nothing if it's down, but now I am just trying to generate an alert. The best scenario would be a visualization that shows the 4 servers and explicitly shows no data for one of them when it goes down.
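One way to sketch that visualization, under the assumption that the expected host list can be hard-coded inline (here via makeresults; swap in however you track your hosts):

index="os" source="top" host=ns1 OR host=ns2 OR host=ns3 OR host=ns4 named earliest=-2min latest=-1min
| stats count by host
| append [| makeresults | eval host=split("ns1,ns2,ns3,ns4", ",") | mvexpand host | eval count=0 | fields host count]
| stats max(count) as count by host
| eval status=if(count > 0, 1, 0)

The append subsearch injects one zero-count row per expected host, and stats max(count) collapses the duplicates, so every host ends up with status=1 (up, chart it green) or status=0 (no data). A down host then still appears on the chart instead of vanishing.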


jnussbaum_splun
Splunk Employee

Here's one way to do it:

Create a lookup table that captures all the hosts you want to alert on, whether named is down or data is not coming into Splunk at all:

index="os" source="top" host=ns1 OR host=ns2 OR host=ns3 OR host=ns4 | stats count by host | fields - count | outputlookup named_hosts

Make sure to run this over a wide enough time range to capture data from ALL of these hosts when building the lookup.
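For example, you might build it over a wider window (24 hours here is just an illustrative choice; use whatever covers all four hosts):

index="os" source="top" host=ns1 OR host=ns2 OR host=ns3 OR host=ns4 earliest=-24h
| stats count by host
| fields - count
| outputlookup named_hosts

You can then read the lookup back with | inputlookup named_hosts to confirm all four hosts made it in.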

Next, create your search with an append of the inputlookup to supplement potentially missing data with the "source of truth" lookup table containing all the hosts you want to monitor, then filter for hosts that are not either 1) running named or 2) sending data into Splunk:

index="os" source="top" host=ns1 OR host=ns2 OR host=ns3 OR host=ns4 named earliest=-2min latest=-1min | appendcols [|inputlookup named_hosts] | fillnull | where count<1

You can also get crafty and broaden the search: count all events from each host (without the named filter) as "events_from_host", and separately count "named_events_from_host" with the named filter. That way you can tell "host not sending data" apart from "host up but named not running", and alert on each case independently, as in the sketch below.
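A rough sketch of that variant, reusing the named_hosts lookup; the field names events_from_host and named_events_from_host come from the description above, and the match(_raw, "named") test is an assumed stand-in for the bare named search term:

index="os" source="top" host=ns1 OR host=ns2 OR host=ns3 OR host=ns4 earliest=-2min latest=-1min
| stats count as events_from_host, count(eval(if(match(_raw, "named"), 1, null()))) as named_events_from_host by host
| append [| inputlookup named_hosts]
| fillnull events_from_host named_events_from_host
| stats max(events_from_host) as events_from_host, max(named_events_from_host) as named_events_from_host by host
| eval problem=case(events_from_host < 1, "no data from host", named_events_from_host < 1, "named not running")
| where isnotnull(problem)

Because case() returns the first matching clause, a host that sent nothing is reported as "no data from host" rather than "named not running", so the two failure modes surface separately.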

Hope this helps.
