We have set up a clustered Splunk Enterprise environment, and we have recently seen multiple scheduled searches being skipped, with the observed skip ratio varying from 80% to 99%.
While browsing the forum, we saw that these limits can be modified in the limits.conf file.
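For reference, these are the settings we found mentioned (the values shown are the shipped defaults as we understand them, not our production configuration):

```ini
# limits.conf (defaults, for illustration only)
[search]
# Base number of concurrent searches allowed regardless of CPU count
base_max_searches = 6
# Additional concurrent searches allowed per CPU core
max_searches_per_cpu = 1

[scheduler]
# Percentage of the total concurrency limit the scheduler may use
max_searches_perc = 50
# Share of the scheduler's quota reserved for auto-summarization searches
auto_summary_perc = 50
```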
My question is whether raising the concurrency limit or the search quota allocated to scheduled searches is advised as a best practice, or whether doing so would have repercussions in the long run.
Also, what could be considered a quick remedial action to mitigate this issue?
Searches are skipped because there are more searches scheduled than there are resources available to run them. They can also be skipped if an instance of the same search is already running. Use the Monitoring Console to determine which is the case.
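You can also query the scheduler's internal logs directly to see which searches are skipped and why. A search along these lines should work (the field names shown are the ones the scheduler typically logs; verify them against your version):

```
index=_internal sourcetype=scheduler status=skipped
| stats count by savedsearch_name, reason
| sort - count
```

The `reason` field usually distinguishes "maximum number of concurrent searches reached" from "the search is already running", which tells you which of the two situations below applies.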
If a search is skipped because it's already running, then either the search is inefficient and runs too long, or it's scheduled to run too often. Review the SPL to make it perform better and/or lengthen the interval so the first instance completes before another starts.
If searches are skipped because of a lack of resources, then you may have too many searches scheduled to run at the same time. Watch out for a large number of searches scheduled to run at the top of the hour, and reschedule them to other times.
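Staggering is done with the `cron_schedule` setting in savedsearches.conf; you can also let Splunk randomize start times slightly with `allow_skew`. The stanza names below are made up for illustration:

```ini
# savedsearches.conf -- two hourly searches moved off the top of the hour
[hourly_report_a]
cron_schedule = 7 * * * *

[hourly_report_b]
cron_schedule = 23 * * * *
# Let the scheduler shift the start time by up to 5 minutes to spread load
allow_skew = 5m
```

Even just changing the minute field of a few heavy searches often reduces skips noticeably without touching any concurrency limits.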