
Alert scheduler not scheduling alert as per configuration

ksubramanian198
Engager

Hi everyone,
I have configured an alert with the cron expression */1 * * * * so that it runs every minute. After I save the alert, the next scheduled time is set to the current time plus one minute. Once the job has run, the next scheduled time should advance to the following minute, but it does not. This was working fine until yesterday, and the issue suddenly appeared for all alerts.

Please help me resolve this.

0 Karma

Arpit_S
Path Finder

You can always use * * * * * to schedule a search to run every minute, but whenever the search takes longer than one minute to run, the scheduler will start skipping the next run. It is always better to keep a safe interval between searches.
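
To illustrate the skipping behaviour described above, here is a minimal Python sketch. It is a toy model, not Splunk's actual scheduler: it assumes a single search, a fixed run time, and that a run is skipped whenever the previous run is still in progress. The function name `simulate_schedule` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta


def simulate_schedule(start, run_seconds, ticks, interval_seconds=60):
    """Toy model of one scheduled search (NOT Splunk's real scheduler).

    A tick fires every `interval_seconds`. If the previous run has not
    finished by the time a tick fires, that tick is marked "skipped".
    """
    results = []
    busy_until = start  # time at which the previous run finishes
    for i in range(ticks):
        tick = start + timedelta(seconds=i * interval_seconds)
        if tick < busy_until:
            # Previous run is still in progress, so this run is skipped.
            results.append((tick, "skipped"))
        else:
            results.append((tick, "success"))
            busy_until = tick + timedelta(seconds=run_seconds)
    return results


if __name__ == "__main__":
    start = datetime(2018, 8, 14, 17, 0, 0)
    # A search that takes 90 seconds on a one-minute schedule:
    for tick, status in simulate_schedule(start, run_seconds=90, ticks=6):
        print(tick.strftime("%H:%M:%S"), status)
```

In this model, a 90-second search on a one-minute schedule succeeds on every other tick and is skipped on the ticks in between, while a 30-second search never skips.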

0 Karma

ksubramanian198
Engager

The searches are not being skipped. As far as I can tell, the scheduler is hung and the next scheduled time is not being updated for the alerts 😞
That is why I get zero results from the following search: index=_internal sourcetype=scheduler app="postilion*" | timechart count by status

0 Karma

ksubramanian198
Engager

The next scheduled time stopped at 2018-08-14 17:00:00 CDT and has not been updated since. I checked the splunkd logs and there are no errors from my applications.

0 Karma

burwell
SplunkTrust

I have alerts on my Monitoring console for skipped searches. I alert to Slack.

I wrote about finding skipped searches: https://answers.splunk.com/answers/514181/skipped-searches-on-shc.html

Paul Lucas gave a great talk at .conf17 on the new scheduler (as well as at our SF Bay Area Splunk user group on August 8): https://conf.splunk.com/files/2017/slides/making-the-most-of-the-splunk-scheduler.pdf The skew feature looks very useful.

0 Karma

andreacorvini
Path Finder

Did you try "* * * * *" (five asterisks) in the cron definition?

0 Karma

ksubramanian198
Engager

Yes, I have tried both "* * * * *" (five asterisks) and "*/1 * * * *" as the cron definition.

0 Karma

skoelpin
SplunkTrust

So it was working yesterday and stopped working today? I'm willing to bet your searches are being skipped because your hardware can't handle the load. You should look at the internal index to verify this:

index=_internal sourcetype=scheduler status="skipped"

0 Karma

sudosplunk
Motivator

Agreed. Running an alert every minute can sometimes use as many resources as a real-time search does. Also, what is the search you're using for this alert?

0 Karma

ksubramanian198
Engager

Hi,
I checked this with the following search over the last 7 days: index=_internal sourcetype=scheduler app="postilion*" | timechart count by status. The counts are 0 after 10 August 2018 for the skipped, continued, and success statuses.

0 Karma

brian_rampley
Path Finder

Just to make sure we are checking everything, what does your Splunk environment look like? If you're running distributed, are you forwarding your search head logs to your indexers?

0 Karma