Hi all,
I'm looking to trigger an alert if our DHCP server loses connection with its partner DHCP for more than 30 minutes.
When the server loses connectivity we get "EventCode=20255" in the logs.
This happens fairly often due to patching, but the server would always be back within 30 minutes, so we shouldn't get an alert in that case.
When the connection is re-established we get "EventCode=20254".
The question is, how would I trigger an alert if more than 30 minutes elapses before "EventCode=20254"?
Here's one way, although perhaps not the most performant.
index=foo (EventCode=20254 OR EventCode=20255)
| transaction startswith="EventCode=20255" endswith="EventCode=20254"
| where duration > 1800
The transaction command looks for the events in the expected order and calculates the time between them, putting the result into the duration field. The where command keeps only the results where the events are more than 1800 seconds (30 minutes) apart. All that's left is to trigger the alert if the number of results is not zero.
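If transaction turns out to be too heavy, a streamstats-based variant is one possible alternative. This is only a sketch under the same assumptions as the query above (index=foo, only these two event codes, a single DHCP server in the results); prev_code and prev_time are names made up for illustration.

```
index=foo (EventCode=20254 OR EventCode=20255)
| sort 0 _time
| streamstats current=f last(EventCode) as prev_code last(_time) as prev_time
| where EventCode=20254 AND prev_code=20255 AND (_time - prev_time) > 1800
```

The sort puts events in chronological order, streamstats with current=f copies the previous event's code and time onto each event, and the where keeps a reconnect only if the event right before it was a disconnect more than 1800 seconds earlier. Like the transaction version, it only reports outages that have already ended.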
Thanks. Will that still alert if "EventCode=20254" hasn't been triggered?
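No, it won't. By default, transaction discards groups that never see the endswith event, so an outage that is still in progress produces no result.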
Here's another query that should work better. It looks at the most recent of the two events and triggers if that event is a Down event (EventCode=20255) more than 30 minutes old.
index=foo (EventCode=20254 OR EventCode=20255)
| head 1
| where (EventCode=20255 AND (now() - _time) > 1800)
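If more than one DHCP server logs to the same index, the same idea can probably be applied per server with stats. A sketch, assuming the default host field identifies each server; last_code and last_time are names made up for illustration.

```
index=foo (EventCode=20254 OR EventCode=20255)
| stats latest(EventCode) as last_code latest(_time) as last_time by host
| where last_code=20255 AND (now() - last_time) > 1800
```

Each server gets one row holding its most recent event code and timestamp, and a row survives only if that code is the Down event and it is more than 30 minutes old. The search time range needs to reach back far enough to include the last event from each server, and the alert can again trigger when the number of results is not zero.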