I've been banging my head against the wall for the last day or so trying to figure out why my alerts aren't firing any more. I'm on Splunk version 7.1.2. All the alerts worked fine up until now. I haven't made any configuration changes, so it's odd that they stopped working. The environment may be a bit overloaded, so as a troubleshooting step I've scaled back just about everything, yet it's still not working. There are only 2 servers in this setup (search head and indexer), and neither can send email. The servers are virtual, and I first noticed the issue one day when I tried to run a search and the indexer had frozen up. I logged into vSphere and rebooted it and all was well...
Before you go down the list of things to check, here's what I've tried:
I've disabled all alerts except 1 test alert, and I've disabled or turned down anything else on that server that could be competing for resources. I've looked into the Monitoring Console: I've checked search activity and the skipped search ratio, and gone through every link in the Monitoring Console, and everything looks healthy.
To rule out an SMTP issue, I tried testing an alert from the command line and THAT WORKS!
* | top 5 host | sendemail to="myemail.com"
^ The email comes through every time. However, my saved alerts aren't working. For each alert I have it send an email and add to Triggered Alerts:
1) It's not sending the SMTP alert
2) It's stating "There are no fired events for this alert", even though when you run the search behind the alert you can clearly see it should be triggering.
I looked into the python logs and didn't see anything bad there. I'd really appreciate some solid help with this one. Thanks
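For anyone wanting to repeat the python log check, a search along these lines is one way to do it. This is only a sketch: it assumes the default `splunk_python` sourcetype that python.log is normally indexed under in `_internal`, and the keyword filter is just a guess at what would be relevant.

```
index=_internal sourcetype=splunk_python (ERROR OR sendemail)
| sort - _time
| table _time host _raw
```

If email actions were being attempted and failing, I'd expect sendemail errors to show up here; nothing at all would suggest the action is never being invoked.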
So I cloned an alert, and shortly afterwards the cloned alert actually fired off an email once and showed up in Triggered Alerts! Could my system be overloaded? The skip ratio is 0 and nothing else incredibly offensive popped out at me. What could I check to see whether something is choking the alerts and stopping them from firing?
I tried: index=_internal search_type=scheduled
When I click on "alert_actions", summary_index is 81% of all alerts, 18% appears to be ITSI related, and only .087% is email (count of 3).
In my mind I'm thinking it could be a performance issue caused by all these saved searches, but I'm not seeing it in the Monitoring Console. I guess I don't know where to look, or what to do to stop summary indexing from hogging all the saved search real estate.
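To narrow this down, the scheduler logs can be queried directly rather than through the field sidebar. The searches below are a rough sketch, assuming the default `scheduler` sourcetype in `_internal` and its usual fields (`savedsearch_name`, `status`, `result_count`, `alert_actions`, `suppressed`, `run_time`); "My Test Alert" is a placeholder for the real alert name.

```
index=_internal sourcetype=scheduler savedsearch_name="My Test Alert"
| table _time status result_count alert_actions suppressed run_time
```

```
index=_internal sourcetype=scheduler
| stats count by alert_actions status
```

As I understand it, if the first search shows status=success with an empty alert_actions, the alert ran but its trigger condition wasn't met for that time window, which would be a different problem from the scheduler skipping it. The second search gives the same breakdown I saw in the sidebar, split by whether each run succeeded or was skipped.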