We have a Search Head Cluster with 3 search heads. We have 70 searches that are supposed to run every minute.
We find that 14-15% of searches are being skipped on the SH captain. We tried changing the captain and observed the same behavior on the new captain. We do not have any SH designated for ad-hoc searches.
Please see the image below: the other search heads are not skipping any searches. Also note that the SH captain is running a higher number of searches.
Please let us know if there is a way to get around this.
You're running 70 searches per minute on 3 SHs. What are the specs of those SHs? Are you seeing those skipped searches throughout the day or only during peak hours?
Try running this search to see at what time you're getting the most drops:
index=_internal sourcetype=scheduler status=skipped | timechart span=1min count
It could be that you simply need more CPU cores on your SHs to handle the load.
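Since the skips seem tied to one member, it may also help to break them down by search head and by the reason the scheduler logged. This is only a sketch: it assumes the `host` field identifies each search head and that the `reason` field is extracted from your scheduler events, so adjust the field names if yours differ.

```
index=_internal sourcetype=scheduler status=skipped
| stats count by host, reason
| sort - count
```

A variant of the same idea, under the same assumptions, gives the skip percentage per search head so it can be compared with the 14-15% you quoted:

```
index=_internal sourcetype=scheduler (status=success OR status=skipped)
| stats count(eval(status="skipped")) AS skipped, count AS total by host
| eval skip_pct=round(100*skipped/total, 1)
```

If the reason is mostly about hitting a concurrency limit, that points at the search quota rather than raw CPU.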
We made the following settings -
Our intent is to push all scheduled searches to sh1 and monitor the performance. We will keep you apprised of our observations.
Thanks for your note.
Our three SHs have 36 CPUs each. Skipping is observed only on the search head captain, and it occurs throughout the day.
We have already set max_searches_per_cpu to 4 on all SHs. (This is taking us close to the brink, though.)
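For reference, here is a rough sketch of how those numbers translate into per-instance concurrency limits. Only max_searches_per_cpu = 4 and the 36 CPUs come from our setup; base_max_searches = 6 and max_searches_perc = 50 are assumed to still be at their documented defaults, which we have not verified on our version, and how the captain applies these limits across the cluster is not covered here.

```
# limits.conf (illustrative sketch, not a copy of our actual file)
[search]
max_searches_per_cpu = 4
# assumed default, not confirmed in our environment
base_max_searches = 6
# max_hist_searches = max_searches_per_cpu x number_of_cpus + base_max_searches
#                   = 4 x 36 + 6 = 150 concurrent historical searches

[scheduler]
# assumed default, not confirmed in our environment
max_searches_perc = 50
# scheduled-search ceiling = 50% of 150 = 75 concurrent scheduled searches
```

Under those assumptions, each SH would allow about 75 concurrent scheduled searches, which is just above the 70 that are due every minute.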