We had a problem over the weekend where one of our alerts did not trigger. I had to restart the services to get everything working again.
Does anyone have any idea why this might have happened?
It's possible it was related to changes we made. This is the second time in a week we've needed to restart the services to get changes to take effect.
In the end it appeared that the Splunk server was skipping the trigger, as there is apparently a limit of 1 real-time alert per CPU core.
We increased this limit and it mostly fixed the issue.
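For anyone hitting the same thing: the concurrency caps live in limits.conf on the search head. Exact defaults vary by version, so treat this as a sketch of what we changed rather than recommended values:

[search]
# concurrent historical searches allowed per CPU core (we assumed the default of 1 was the bottleneck)
max_searches_per_cpu = 2
# real-time searches are capped at max_rt_search_multiplier times the historical limit
max_rt_search_multiplier = 2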
What changes did you make and how (deployed/updated conf files/from UI)?
I created a new lookup table, added the new fields to the search and also to the email alert that goes out.
I should say I tested it and it was working at 11.55pm on Friday. Then nothing for the rest of the weekend.
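For context, the alert search references the lookup roughly like this (the lookup and field names here are made up, not the real ones):

... | lookup new_lookup_table host OUTPUT owner_email severity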
Did you check the scheduler logs to see whether the alert search ran and whether there were results that would trigger the alert?
index=_internal sourcetype=scheduler savedsearch_name="YourAlertName"
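Grouping by the status field will also show whether any runs were skipped rather than completed (field name as it appears in scheduler.log on our version, so check yours):

index=_internal sourcetype=scheduler savedsearch_name="YourAlertName" | stats count by status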
Thanks for this, yes it shows that it ran 1441 times on that day, meaning it ran every minute of the day, so the scheduling itself looks fine.
Also, if I run the search that the alert is built on, the event shows up, so I know the criteria were met.
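It might also be worth checking whether those scheduled runs actually returned results and fired the email action; the scheduler log carries result_count and alert_actions fields (at least on the versions I've used), so something like this should show it:

index=_internal sourcetype=scheduler savedsearch_name="YourAlertName" | table _time status result_count alert_actions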
This morning I sent a test alert and no email came through.
I restarted the services.
Then I sent another test alert and it worked.
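If it happens again before you restart, it may be worth searching splunkd's own logs for warnings about search concurrency limits being reached (I'm not sure of the exact message text, so this is deliberately a broad search):

index=_internal sourcetype=splunkd log_level=WARN "concurrent" "searches"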