I'm configuring an alert for changes in EIGRP neighbor adjacency. I've configured a field extraction that defines the fields:
I'm using the transaction command to correlate the "down" and "up" messages for a given host, interface, and neighbor.
The alert has multiple conditions. Here's the logic:
IF the transaction isn't closed (i.e., no "up" message received) and the state is "down" --> Alert
IF the transaction is closed and the duration (i.e., the downtime) was greater than 30 seconds --> Alert
Here's the search string:
index=network NBRCHANGE | transaction host eigrp_interface eigrp_neighbor startswith=eigrp_state="down" endswith=eigrp_state="up" keepevicted=true | eval eigrp_alert=if((closed_txn=0 AND eigrp_state="down") OR (closed_txn=1 AND duration>30),1,0) | search eigrp_alert=1
This works. I want to add one more condition: alert if an interface is "flapping". In other words, if more than x "down" messages are seen for the same neighbor within a given period of time, alert. I can't figure out how to add this logic.
First, how often do you run the alerting search, and over what time range? You might want to add maxspan=2m to your transaction to cap how long a single transaction can span, which should also reduce the time the search takes. How many devices are you searching across? What is the total time range for the transaction search?
Second, you might just want a separate search to look for flapping, one that runs sourcetype=router down | stats count by host and alerts when count > 20 over a 2-minute window, or something like that. You could even break it down by port as well, then map MAC address to IP to find out which link is having problems.
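As a rough sketch of what that second search could look like with the index and field names from the original search: the 5-minute window and the threshold of 3 are placeholder assumptions, not recommendations, and it assumes eigrp_state is extracted on every NBRCHANGE event.

```spl
index=network NBRCHANGE eigrp_state="down" earliest=-5m
| stats count AS down_count BY host eigrp_interface eigrp_neighbor
| where down_count > 3
```

Saved as its own alert that triggers when the number of results is greater than zero, this would return one row per host, interface, and neighbor that went down more than 3 times in the window. Tune both values to whatever counts as flapping in your environment.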
The search will run every 5 minutes. There are probably around 2,000 devices in the "network" index. What do you mean by the total time range for the transaction search?
I think the suggestion for a second search is probably the best way to go on this.
Every 5 minutes is what I was looking for. Has the second search worked out? I hope it is all working for you now.
You might want to try counting each state with an eval inside stats:
index=network NBRCHANGE | stats count(eval(eigrp_state=="down")) AS DOWN, count(eval(eigrp_state=="up")) AS UP by eigrp_interface | where DOWN > 5
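If the search covers a longer time range than the flapping window, one possible variation is to bucket events by time before counting, so the threshold applies per window rather than to the whole range. This is only a sketch: the 2-minute span and the threshold of 5 are assumptions, and grouping by host and eigrp_neighbor is added here to match the per-neighbor requirement in the question.

```spl
index=network NBRCHANGE
| bin _time span=2m
| stats count(eval(eigrp_state=="down")) AS DOWN BY _time host eigrp_interface eigrp_neighbor
| where DOWN > 5
```

Note that the buckets are fixed, not sliding, so a burst of "down" messages that straddles a bucket boundary can be split across two rows and missed.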
I hope that helps and gets you closer to your answer.