Alerting

How to set up an alert to trigger only when both Check Point devices in a High Availability pair fail?

Thuan
Explorer

Our Check Point firewalls run as pairs in HA mode: one device is "hot" while the other is in "stand-by" mode.
I need to raise an alert when both devices in a pair fail, as this causes a data outage.

The search below identifies every device that has not sent logs in the last 5 minutes. The lookup table Checkpoint-Hosts-122115.csv then pairs each of those devices with its HA partner. The results show that the pairing via the lookup table works.

My issue is that I have not been able to compute the delay for the paired device as well. Both devices in a pair have to stop sending logs for the outage condition to be met.

FYI, I use tstats as it is more efficient: the Check Point logs come from 60 HA pairs and are very noisy. The search runs every 10 minutes.

| tstats latest(_time) AS lastTime WHERE index=checkpoint host=* BY host
| eval current=now()
| eval delay1=current-lastTime
| where delay1 > 300
| dedup host
| lookup Checkpoint-Hosts-122115.csv host-pri AS host OUTPUT host-bak
| rename host AS host-pri
| table host-pri delay1 host-bak

host-pri             delay1    host-bak
go-bldxfwbpz-bak     675       go-bldxfwbpz-pri
go-bldxfwbpz-pri     3482      go-bldxfwbpz-bak
go-bldxfwe-bak       4023      go-bldxfwe-pri

javiergn
Super Champion

Let me see if I get this right. Based on your requirements above and the example provided, you would like to be alerted when both primary and backup are down, and therefore you would expect an alert for the following two hosts, correct?

host-pri             delay1    host-bak
go-bldxfwbpz-bak     675       go-bldxfwbpz-pri
go-bldxfwbpz-pri     3482      go-bldxfwbpz-bak

In that case, first use a subsearch to return the backup hosts of the devices matching your expression above. Then use the main search to keep only those backup hosts whose delay is also > 300, and finally apply the lookup in the other direction to return the primary for each backup. I think that should do it. Something like:

| tstats latest(_time) AS lastTime WHERE index=checkpoint host=* BY host
| search [
   | tstats latest(_time) AS lastTime WHERE index=checkpoint host=* BY host
   | eval current=now()
   | eval delay1=current-lastTime
   | where delay1 > 300
   | dedup host
   | lookup Checkpoint-Hosts-122115.csv host-pri AS host OUTPUT host-bak
   | rename host AS host2
   | rename host-bak AS host
   | fields host
]
| eval current=now()
| eval delay1=current-lastTime
| where delay1 > 300
| dedup host
| lookup Checkpoint-Hosts-122115.csv host-bak AS host OUTPUT host-pri
| rename host AS host-bak
| table host-pri host-bak

Let me know if that works.

Thanks,
J

javiergn
Super Champion

Note: even though my search above works, it would probably benefit from some tweaking. For instance, the variable current should be evaluated once rather than twice. You might also want to filter out duplicate results from the final output.
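
One way to apply these tweaks is to do everything in a single pass, without the subsearch. The following is only a sketch and has not been tested against your data. It assumes that the lookup lists every device under host-pri with its partner under host-bak (as the sample output in the question suggests), and that the search time range is comfortably longer than 300 seconds. The field names peer, pair, minDelay and hostsSeen are made up for illustration.

```
| tstats latest(_time) AS lastTime WHERE index=checkpoint host=* BY host
| eval delay1=now()-lastTime
| lookup Checkpoint-Hosts-122115.csv host-pri AS host OUTPUT host-bak AS peer
| where isnotnull(peer)
| eval pair=mvjoin(mvsort(mvappend(host, peer)), " / ")
| stats min(delay1) AS minDelay count AS hostsSeen BY pair
| where minDelay > 300
```

Both devices of a pair get the same pair key, so min(delay1) exceeds 300 only when neither of them has logged in the last 5 minutes, and each pair appears only once in the output. hostsSeen shows how many of the pair's devices still appear in the results. A device that sent nothing at all within the search time range does not show up in the tstats results, so a pair whose devices have both been silent for the whole range yields no row; a wider time range reduces that risk.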

Thuan
Explorer

Good morning javiergn,
It works! I will do some tweaking as you recommended.
Thank you very much!

javiergn
Super Champion

No worries. I'm glad it worked.

Please don't forget to mark it as answered if you liked it so that others can benefit from it.
