Why is the searchhead captain skipping some searches?

ykpramodhcbt · ‎12-22-2017

Hi Splunkers,

We have a Search Head Cluster with 3 search heads. We have 70 searches that are supposed to run every minute.

We find that 14-15% of searches are getting skipped on the SH captain. We tried changing the captain and observed the same phenomenon on the new captain too. We do not have any SH designated for ad-hoc searches.

Please find the image below, which shows that the other search heads are not experiencing any skips. Also note that the SH captain is running a higher number of searches.

Please let us know if there is a way to get around this.

DavidHourani · ‎12-23-2017

Hi ykpramodhcbt,

You're running 70 searches per minute on 3 SHs. What are the specs on those SHs? Are you seeing those skipped searches throughout the day or only during peak hours?

Try running this search to see at what times you're getting the most skips:
index=_internal sourcetype=scheduler status=skipped |timechart span=1min count

It could be that you simply need more CPU cores on your SH's to handle all the load.
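If the scheduler events are exported for comparison outside Splunk, the per-host skip rate can be tallied along the lines below. This is only a sketch: the event dicts are made-up sample rows that assume each exported event carries `host` and `status` fields, and `skip_ratio_by_host` is a hypothetical helper, not part of any Splunk SDK.

```python
from collections import Counter


def skip_ratio_by_host(events):
    """Return {host: skipped / total} for an iterable of scheduler event dicts."""
    total = Counter()
    skipped = Counter()
    for event in events:
        host = event["host"]
        total[host] += 1
        if event["status"] == "skipped":
            skipped[host] += 1
    return {host: skipped[host] / total[host] for host in total}


# Hypothetical sample rows; real data would come from an export of the
# scheduler events returned by the search above.
sample_events = [
    {"host": "sh1", "status": "success"},
    {"host": "sh1", "status": "skipped"},
    {"host": "sh2", "status": "success"},
    {"host": "sh2", "status": "success"},
]

ratios = skip_ratio_by_host(sample_events)
for host, ratio in sorted(ratios.items()):
    print(f"{host}: {ratio:.0%} skipped")
```

A host whose ratio stands well above the others is the one to look at first.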

Regards,
David

MonkeyK · ‎12-23-2017

Interesting. Not my question, but if one SH is skipping, how do we check what is creating load on that SH? Or, more generally, what should we look at to reduce load on the overloaded SH?

DavidHourani · ‎12-23-2017

Maybe this answer can help you find what is causing the load: https://answers.splunk.com/answers/583285/how-to-list-ad-hocscheduled-searches-in-order-of-c.html

ykpramodhcbt · ‎12-23-2017

We made the following settings:

sh2 - captain, ad-hoc search head
sh3 - ad-hoc search head

Our intent is to push all scheduled searches to sh1 and monitor the performance. We will keep you apprised of our observations.

ykpramodhcbt · ‎12-23-2017

Hi DavidHourani,

Thanks for your note.

Our three SHs have 36 CPUs each. Skipping is observed only on the search head captain, and we see it throughout the day.

We have already set max_searches_per_cpu to 4 on all SHs. (This is taking us closer to the brink, though.)
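For reference, a rough sketch of what that setting implies per search head. It assumes the formula described in the limits.conf documentation (max_hist_searches = max_searches_per_cpu x number_of_cpus + base_max_searches, with scheduled searches capped at max_searches_perc percent of that) and that base_max_searches and max_searches_perc are still at their defaults of 6 and 50; only the 36 CPUs and max_searches_per_cpu = 4 come from our setup. The function name and the round-down are assumptions for illustration, and how the captain aggregates limits across the cluster is not covered.

```python
def scheduler_concurrency_limit(cpu_count,
                                max_searches_per_cpu=1,
                                base_max_searches=6,
                                max_searches_perc=50):
    """Estimate per-instance concurrency limits from limits.conf-style settings.

    Returns (max_hist_searches, max_scheduled_searches).
    Default argument values are assumed to mirror the limits.conf defaults.
    """
    # Total concurrent historical searches allowed on the instance.
    max_hist_searches = max_searches_per_cpu * cpu_count + base_max_searches
    # Share of that total the scheduler may use (rounding down is an assumption).
    max_scheduled = max_hist_searches * max_searches_perc // 100
    return max_hist_searches, max_scheduled


# 36 CPUs with max_searches_per_cpu raised to 4, other settings assumed default.
total, scheduled = scheduler_concurrency_limit(36, max_searches_per_cpu=4)
print(f"max concurrent searches: {total}, scheduled share: {scheduled}")
```

Under these assumptions each SH allows about 150 concurrent searches, of which about 75 can be scheduled ones, compared with 42 and 21 at the default of 1 search per CPU.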

DavidHourani · ‎12-23-2017

The SH captain handles scheduling and dispatching, so it consumes more CPU than the other members. You're seeing the problem on the captain regardless of which host is the captain, right?

ykpramodhcbt · ‎12-23-2017

Yes, DavidHourani.

We have identical configuration on all the SHs.

nikita_p · ‎12-23-2017

Hi @ykpramodhcbt,
You may find the answer you are looking for at the link below. This might help you.
https://answers.splunk.com/answers/514181/skipped-searches-on-shc.html
