Splunk Search

[8.0.6] Scheduler stops dispatching scheduled searches intermittently.

sylim_splunk
Splunk Employee

The scheduler on our Splunk search head cluster stopped, and users complained that alerts and scheduled reports were not running or being processed. We disabled and re-enabled the scheduler on the captain, but that didn't work. We then transferred captaincy to another member, and that worked: scheduling and processing resumed.
Today the problem recurred on a different search head cluster. We switched captains, and that remediated the issue again.

We are running version 8.0.6. One recent change: cascading bundle replication was enabled about a month ago.

1 Solution

sylim_splunk
Splunk Employee

Once the scheduler is stuck and stops scheduling jobs, it writes nothing at all to scheduler.log on the captain. The absence of new entries there can be used to confirm that the scheduler is stuck on something.
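
As a rough illustration of that check, the sketch below flags scheduler.log as stale when it has not been modified for a while. It assumes the log lives under $SPLUNK_HOME/var/log/splunk/ and falls back to /opt/splunk when SPLUNK_HOME is unset. The function names and the 10-minute threshold are hypothetical choices for this example, not Splunk settings, and a quiet log is only a hint, not proof of a deadlock.

```python
import os
import time


def default_scheduler_log():
    """Build the assumed scheduler.log path from SPLUNK_HOME (default /opt/splunk)."""
    splunk_home = os.environ.get("SPLUNK_HOME", "/opt/splunk")
    return os.path.join(splunk_home, "var", "log", "splunk", "scheduler.log")


def seconds_since_last_write(path, now=None):
    """Return how many seconds have passed since the file was last modified."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(path)


def scheduler_looks_stuck(path, max_idle_seconds=600, now=None):
    """Return True when the log has been idle longer than max_idle_seconds.

    The 600-second default is an arbitrary example value; choose one that
    is longer than the normal gap between scheduled searches on your captain.
    """
    return seconds_since_last_write(path, now) > max_idle_seconds
```

Run on the captain, `scheduler_looks_stuck(default_scheduler_log())` returning True would be a cue to collect pstacks before restarting anything.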

Collecting pstack output while the problem is occurring helps identify the cause.
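
A minimal sketch of such a collection, assuming the `pstack` utility is installed on the captain and that you already know the main splunkd process ID. The helper names, the sample count, the interval, and the output file naming are all hypothetical; follow whatever Splunk Support asks for in your case.

```python
import subprocess
import time


def pstack_command(pid):
    """Build the argument list for one pstack run against the given PID."""
    return ["pstack", str(int(pid))]


def collect_pstacks(pid, samples=5, interval_seconds=1.0,
                    out_prefix="pstack", run=subprocess.run):
    """Take several stack snapshots of one process and save each to a file.

    Several samples taken a short time apart make it easier to see threads
    that never move, which is what a deadlock looks like. The `run` parameter
    exists only so the command runner can be swapped out in a dry run.
    """
    paths = []
    for i in range(samples):
        result = run(pstack_command(pid), capture_output=True, text=True)
        path = f"{out_prefix}-{pid}-{i}.txt"
        with open(path, "w") as fh:
            fh.write(result.stdout)
        paths.append(path)
        if i < samples - 1:
            time.sleep(interval_seconds)
    return paths
```

The files are written to the current directory, so run it from a location with enough free space and attach the results to the support case together with the diag.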

Inspection of the pstacks collected while the issue was occurring showed that it was caused by a deadlock triggered by cascading bundle push. The fix is available in 8.0.7+ and 8.1.3+.
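
To make the version thresholds concrete, here is a small sketch that checks a version string against them. It reads "8.0.7+ and 8.1.3+" as "8.0.7 or later on the 8.0.x line, 8.1.3 or later on the 8.1.x line", which is an assumption. For any other release line it returns None rather than guessing, because this post does not say which other versions are affected. The names are hypothetical.

```python
# Minimum patch level carrying the fix, per release line, as stated in the post.
FIXED_IN = {(8, 0): 7, (8, 1): 3}


def parse_version(text):
    """Turn a dotted version such as '8.0.6' into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def has_fix(version_text):
    """Return True/False for 8.0.x and 8.1.x, or None for lines not covered here."""
    version = parse_version(version_text)
    release_line = version[:2]
    patch = version[2] if len(version) > 2 else 0
    if release_line in FIXED_IN:
        return patch >= FIXED_IN[release_line]
    return None  # not covered by the information in this post
```

For example, `has_fix("8.0.6")` is False, matching the version in the question, while `has_fix("8.0.7")` is True.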

If you would like to confirm that this is what you are hitting, please open a support case with a diag and pstack output collected while the issue is occurring, before restarting Splunk. The reference is SPL-200260.

