Splunk Enterprise

SearcH Head WARN- ConfReplicationThread & ConfMetrics

venkateshparank
Path Finder

Can anyone help why we are seeing these WARN in logs and how to fix permanently.

We are performing manual resync whenever count of events is > 5 in 15mins time range using below query:

index=_internal host=searchhead* component=ConfReplicationThread log_level=WARN "Cannot accept push" |bin span=15m _time|stats max(consecutiveErrors) as count by host,_time|where count>5

LOGS:

==========

WARN ConfReplicationThread - Error pushing configurations to captain=https://searchhead01.domain.com:8089, consecutiveErrors=1 msg="Error in acceptPush: Non-200 status_code=400: ConfReplicationException: Cannot accept push with outdated_baseline_op_id=8d89fca5ef4520b00b8ffe8b1366a178b92b52fb; current_baseline_op_id=a948f0e3f0fcae707ce37ca7d7a73"

ConfReplicationThread - Error pushing configurations to captain=https://searchhead01.domain.com:8089, consecutiveErrors=1 msg="Error in acceptPush: Non-200 status_code=400: ConfReplicationException: Cannot accept push with outdated_baseline_op_id=66098bdc22c2bcacf951fb104558db365ac64820; current_baseline_op_id=085e675e4c9d8c9fafabee"

ConfReplicationThread - Error pulling configurations from captain=https://searchhead01.domain.com:8089, consecutiveErrors=1 msg="Error in fetchFrom, at=a6a747e7138353bd07873f04fe90f2c9b4564567: Network-layer error: Connect Timeout"

ConfReplicationThread - Error pulling configurations from captain=https://searchhead01.domain.com:8089, consecutiveErrors=1 msg="Error in fetchFrom, at=a6a747e7138353bd07873f04fe90f2c9b4564567: Network-layer error: Connect Timeout"

ConfReplicationThread - Error pulling configurations from captain=https://searchhead01.domain.com:8089, consecutiveErrors=1 msg="Error in fetchFrom, at=a6a747e7138353bd07873f04fe90f2c9b4564567: Network-layer error: Connect Timeout"

=============

Even tried to reduce the max_push count to 50 (default is 100). How can we resolve this permanently ?

==============

WARN ConfMetrics - single_action=PUSH_TO took wallclock_ms=1525! Consider a lower value of conf_replication_max_push_count in server.conf on all members.
WARN ConfMetrics - single_action=PUSH_TO took wallclock_ms=2644! Consider a lower value of conf_replication_max_push_count in server.conf on all members.
WARN ConfMetrics - single_action=PULL_FROM took wallclock_ms=2011! Consider a lower value of conf_replication_max_pull_count in server.conf on all members.
WARN ConfMetrics - single_action=PULL_FROM took wallclock_ms=1778! Consider a lower value of conf_replication_max_pull_count in server.conf on all members.

 

Below are the settings in server.conf

conf_replication_max_push_count = 50
conf_replication_purge.period = 3h
conf_replication_period = 10

 

we do not want to do resync everytime.

splunk resync shcluster-replicated-config

Tags (2)
0 Karma

Nisha18789
Builder

hi @venkateshparank , please check this resolved thread, it seems similar to what you are facing

https://community.splunk.com/t5/Deployment-Architecture/Search-Head-Cluster-Error-acceptPush-non-200...

 

0 Karma
Did you miss .conf21 Virtual?

Good news! The event's keynotes and many of its breakout sessions are now available online, and still totally FREE!