Splunk Enterprise

ConfReplicationThread & ConfMetrics WARN on SearchHeads

venkateshparank
Path Finder

Can anyone help why we are seeing these WARN in logs and how to fix permanently.

We are performing manual resync whenever count of events is > 5 in 15mins time range using below query:

index=_internal host=searchhead* component=ConfReplicationThread log_level=WARN "Cannot accept push" |bin span=15m _time|stats max(consecutiveErrors) as count by host,_time|where count>5

LOGS:

==========

WARN ConfReplicationThread - Error pushing configurations to captain=https://searchhead01.domain.com:8089, consecutiveErrors=1 msg="Error in acceptPush: Non-200 status_code=400: ConfReplicationException: Cannot accept push with outdated_baseline_op_id=8d89fca5ef4520b00b8ffe8b1366a178b92b52fb; current_baseline_op_id=a948f0e3f0fcae707ce37ca7d7a73"

ConfReplicationThread - Error pushing configurations to captain=https://searchhead01.domain.com:8089, consecutiveErrors=1 msg="Error in acceptPush: Non-200 status_code=400: ConfReplicationException: Cannot accept push with outdated_baseline_op_id=66098bdc22c2bcacf951fb104558db365ac64820; current_baseline_op_id=085e675e4c9d8c9fafabee"

ConfReplicationThread - Error pulling configurations from captain=https://searchhead01.domain.com:8089, consecutiveErrors=1 msg="Error in fetchFrom, at=a6a747e7138353bd07873f04fe90f2c9b4564567: Network-layer error: Connect Timeout"

ConfReplicationThread - Error pulling configurations from captain=https://searchhead01.domain.com:8089, consecutiveErrors=1 msg="Error in fetchFrom, at=a6a747e7138353bd07873f04fe90f2c9b4564567: Network-layer error: Connect Timeout"

ConfReplicationThread - Error pulling configurations from captain=https://searchhead01.domain.com:8089, consecutiveErrors=1 msg="Error in fetchFrom, at=a6a747e7138353bd07873f04fe90f2c9b4564567: Network-layer error: Connect Timeout"

=============

Even tried to reduce the max_push count to 50 (default is 100). How can we resolve this permanently ?

==============

WARN ConfMetrics - single_action=PUSH_TO took wallclock_ms=1525! Consider a lower value of conf_replication_max_push_count in server.conf on all members.
WARN ConfMetrics - single_action=PUSH_TO took wallclock_ms=2644! Consider a lower value of conf_replication_max_push_count in server.conf on all members.
WARN ConfMetrics - single_action=PULL_FROM took wallclock_ms=2011! Consider a lower value of conf_replication_max_pull_count in server.conf on all members.
WARN ConfMetrics - single_action=PULL_FROM took wallclock_ms=1778! Consider a lower value of conf_replication_max_pull_count in server.conf on all members.

 

Below are the settings in server.conf

conf_replication_max_push_count = 50
conf_replication_purge.period = 3h
conf_replication_period = 10

 

we do not want to do resync everytime.

splunk resync shcluster-replicated-config

Labels (3)
0 Karma

thambisetty
Super Champion

Based on warn one of your search member is in out of sync. The search member is trying to update search head captain about the change made on search member.

Did you try initiating search member rolling restart?

if not, try restarting your search head cluster.

————————————
If this helps, give a like below.
0 Karma

venkateshparank
Path Finder

Yes, Search Head restart and Resyn manually has been done already.

Still seeing same repetative warnings

0 Karma
Get Updates on the Splunk Community!

The Splunk Success Framework: Your Guide to Successful Splunk Implementations

Splunk Lantern is a customer success center that provides advice from Splunk experts on valuable data ...

Splunk Training for All: Meet Aspiring Cybersecurity Analyst, Marc Alicea

Splunk Education believes in the value of training and certification in today’s rapidly-changing data-driven ...

Investigate Security and Threat Detection with VirusTotal and Splunk Integration

As security threats and their complexities surge, security analysts deal with increased challenges and ...