Splunk Enterprise

Why are scheduled searches getting skipped due to rolling restart?

shadysplunker
Explorer

We have an issue where all scheduled searches are skipped whenever a rolling restart is in progress.

Also, for the past few weeks, we have observed that the cluster master automatically initiates a rolling restart of the indexers twice a week. It takes about 24 hours to restart all 24 indexers in the cluster, which impacts our business too.

Has anyone encountered this situation before?

PickleRick
SplunkTrust

The master should _not_ initiate a restart out of thin air. There must be something that triggers it.

Also see https://docs.splunk.com/Documentation/Splunk/latest/Indexer/Userollingrestart#How_searchable_rolling...

The timeouts and wait times can impact whether your searches get skipped.

shadysplunker
Explorer

Thanks for pointing me in the right direction!

Here's our config from the [clustering] stanza:

[clustering]
mode = master
multisite = true
available_sites = site1, site2
site_replication_factor = origin:1, site1:1, site2:1, total:3
site_search_factor = origin:1, site1:1, site2:1, total:2
cluster_label = cluster1
maintenance_mode = false
max_peers_to_download_bundle = 10
service_interval = 10
heartbeat_timeout = 1800
cxn_timeout = 300
send_timeout = 300
rcv_timeout = 300
max_peer_build_load = 5
rolling_restart = searchable
restart_timeout = 500
decommission_force_timeout = 900
restart_inactivity_timeout = 1500
rebalance_threshold = 0.96
max_auto_service_interval = 250

 

I suspect a few things here: perhaps "rolling_restart" should be "searchable_force", and should max_peers_to_download_bundle be higher than 10, considering we have 24 indexers?

I will go through all these parameters and understand them in detail.

Do you see anything unusual in the configuration here? Any pointers would be very helpful. Thanks!
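
As a rough sanity check on the numbers above, here is a minimal Python sketch that compares the observed restart duration with the timing settings from the stanza. It assumes the peers are restarted one at a time (an assumption, not something confirmed in this thread), and the helper name per_peer_seconds is made up for illustration. It only puts the values side by side; it does not model how Splunk actually applies these timeouts.

```python
import configparser

# Timing-related settings copied from the [clustering] stanza posted above.
STANZA = """
[clustering]
rolling_restart = searchable
restart_timeout = 500
decommission_force_timeout = 900
restart_inactivity_timeout = 1500
heartbeat_timeout = 1800
"""


def per_peer_seconds(total_hours: float, peers: int) -> float:
    """Average wall-clock seconds per peer, ASSUMING peers restart sequentially."""
    return total_hours * 3600 / peers


parser = configparser.ConfigParser()
parser.read_string(STANZA)
clustering = parser["clustering"]

# "About 24 hours to restart all 24 indexers", as reported in the question.
observed = per_peer_seconds(total_hours=24, peers=24)

timeout_names = (
    "restart_timeout",
    "decommission_force_timeout",
    "restart_inactivity_timeout",
    "heartbeat_timeout",
)
timeouts = {name: clustering.getint(name) for name in timeout_names}

print(f"observed average per peer: {observed:.0f} s")
for name, value in sorted(timeouts.items(), key=lambda kv: kv[1]):
    # Show each configured timeout as a share of the observed per-peer time.
    print(f"{name:28s} {value:5d} s ({value / observed:.0%} of observed)")
```

Under that sequential assumption, each peer takes about an hour on average, which is longer than any single timeout listed here. That would suggest looking at what each peer is waiting on during its restart window, and at what triggers the restart in the first place, rather than at any one value in isolation.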
