How to set up an alert to trigger only when both Check Point devices in a High Availability pair fail?

Thuan
Explorer

I have Check Point firewalls that work as pairs in HA mode, where one device is "hot" while the other is in standby mode.
I need to raise an alert when both devices in a pair fail, as this causes a data outage.

The search below identifies every individual device that has not sent logs in the last 5 minutes. The lookup table Checkpoint-Hosts-122115.csv then pairs each of those devices with its HA partner. The output shows that pairing through the lookup table works.

My issue is that I have not been able to use the paired device to compute its own delay: both devices in a pair have to stop sending logs for the outage condition to be met.

FYI, I use tstats because it is more efficient: the Check Point logs from the 60 HA pairs are very noisy. The search runs every 10 minutes.

| tstats latest(_time) AS lastTime WHERE index=checkpoint host=* BY host
| eval current=now()
| eval delay1=current-lastTime
| where delay1 > 300
| dedup host
| lookup Checkpoint-Hosts-122115.csv host-pri AS host OUTPUT host-bak
| rename host AS host-pri
| table host-pri delay1 host-bak

host-pri             delay1    host-bak
go-bldxfwbpz-bak     675       go-bldxfwbpz-pri
go-bldxfwbpz-pri     3482      go-bldxfwbpz-bak
go-bldxfwe-bak       4023      go-bldxfwe-pri

javiergn
Super Champion

Let me see if I get this right. Based on your requirements and the example provided, you want to be alerted only when both the primary and the backup are down, so you would expect an alert for the following two hosts, correct?

host-pri             delay1    host-bak
go-bldxfwbpz-bak     675       go-bldxfwbpz-pri
go-bldxfwbpz-pri     3482      go-bldxfwbpz-bak

In that case, I think the following should work. First, use a subsearch to return the backup hosts of the devices matching your expression above. Then, in the main search, keep only those backup hosts whose own delay is greater than 300. Finally, use the lookup in the other direction to return the primary for each backup. Something like:

| tstats latest(_time) AS lastTime WHERE index=checkpoint host=* BY host
| search [
   | tstats latest(_time) AS lastTime WHERE index=checkpoint host=* BY host
   | eval current=now()
   | eval delay1=current-lastTime
   | where delay1 > 300
   | dedup host
   | lookup Checkpoint-Hosts-122115.csv host-pri AS host OUTPUT host-bak
   | rename host AS host2
   | rename host-bak AS host
   | fields host
]
| eval current=now()
| eval delay1=current-lastTime
| where delay1 > 300
| dedup host
| lookup Checkpoint-Hosts-122115.csv host-bak AS host OUTPUT host-pri
| rename host AS host-bak
| table host-pri host-bak

Let me know if that works.

Thanks,
J

javiergn
Super Champion

Note: even if my search above works, it will probably need some tweaking. For instance, the variable current should be evaluated once rather than twice. You might also want to filter duplicate results out of the final output.
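
One way to apply these tweaks is to drop the subsearch and compute both delays in a single pass. The following is only a sketch and has not been run against the original data. It assumes the lookup lists every host in the host-pri column with its partner in host-bak (as the sample output in the question suggests), and the field names pair, minDelay and hostsSeen are made up for illustration.

```
| tstats latest(_time) AS lastTime WHERE index=checkpoint host=* BY host
| eval delay1 = now() - lastTime
| lookup Checkpoint-Hosts-122115.csv host-pri AS host OUTPUT host-bak
| where isnotnull('host-bak')
| eval pair = mvjoin(mvsort(mvappend(host, 'host-bak')), " / ")
| stats min(delay1) AS minDelay count AS hostsSeen BY pair
| where minDelay > 300
| table pair minDelay hostsSeen
```

Each pair collapses to a single row keyed on its two sorted host names, so duplicates disappear, and a row survives only if even the most recently heard host of the pair has been silent for more than 300 seconds. Because tstats only returns hosts that logged within the search time range, hostsSeen=1 means the partner sent nothing at all in that window, so the time range should be comfortably longer than the threshold.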

Thuan
Explorer

Good morning javiergn,
It works! I will do some tweaking as you recommended.
Thank you a great deal!

javiergn
Super Champion

No worries. I'm glad it worked.

Please don't forget to mark it as answered if you liked it so that others can benefit from it.
