Alerting

Eliminating false positives while monitoring a Hot/Cold deployment?

bigll
Path Finder

Hi.

 

I am monitoring service status on a number of paired servers.

While the service is running on server 1, a report that the service is stopped on server 2 is a false positive.

But if it stopped on server 1 and did not start on server 2, that is the case where I need to be alerted.

Can you share an example of the alert logic?

Thank you


gcusello
SplunkTrust

Hi @bigll,

The main problem is identifying the paired hosts.

If you have only a few paired hosts, you can create a search like this:

<your_search>
| eval paired_host=case(host="host1","Paired_host_1",host="host2","Paired_host_1",host="host3","Paired_host_2",host="host4","Paired_host_2")
| stats dc(host) AS host_count values(host) AS host BY paired_host
| eval status=if(host_count=2,"Both active","Only ".mvjoin(host,", ")." active")

If instead you have many paired hosts and it is too difficult to list them all in one search, you have to create a lookup (called e.g. paired_hosts.csv) containing two fields:

  • host
  • paired_host

and then run a search like this:

<your_search>
| lookup paired_hosts.csv host OUTPUT paired_host
| stats dc(host) AS host_count values(host) AS host BY paired_host
| eval status=if(host_count=2,"Both active","Only ".mvjoin(host,", ")." active")

In both cases, you can create your alerts or your dashboard.
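
One caveat, offered as a sketch rather than a tested solution: a pair in which no host is reporting produces no events, so it never appears in the results of either search above, and that is exactly the case you want to be alerted on. A common pattern is to append every pair from the lookup with a count of 0 and keep only the pairs whose highest count is still 0. This assumes that <your_search> returns events only from hosts where the service is currently running, and that paired_hosts.csv lists every host of every pair:

```
<your_search>
| lookup paired_hosts.csv host OUTPUT paired_host
| stats dc(host) AS host_count BY paired_host
| append
    [| inputlookup paired_hosts.csv
     | stats count BY paired_host
     | eval host_count=0
     | fields paired_host host_count]
| stats max(host_count) AS host_count BY paired_host
| where host_count=0
```

Saved as an alert that triggers when the number of results is greater than 0, this should fire only when the service is running on neither host of a pair; a single active host is the normal Hot/Cold state and returns nothing.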

Ciao.

Giuseppe
