Splunk Enterprise

ArtifactReplicator - Replication connection to ip=10.164.196.166:8999 timed out

PramodhKumar
Explorer

Hi Chaps,

We are having an issue where searches are delayed on the SHC captain following an upgrade from 7.x to 8.x.

There are a variety of errors: some related to artifact replication, some from SHCMaster/SHCSlave, HttpListener, etc.

Note: the replication port and all other required ports are open on all instances.

SHCSlave:

06-18-2020 12:31:44.413 +0100 INFO SHCSlave - event=SHPSlave::handleReplicationError aid=scheduler__admin__nmon__RMD51d5d480c7c4e780c_at_1592479800_2330_644D578C-F001-4711-B459-2338E22DF399 src=644D578C-F001-4711-B459-2338E22DF399 tgt=638683B3-25D9-4D2A-AF2E-4E43362FDBFA failing=644D578C-F001-4711-B459-2338E22DF399 queued replication error job

 

SHCMaster:

06-18-2020 12:48:04.200 +0100 INFO SHCMaster - event=SHPMaster::handleReplicationError replication error src=644D578C-F001-4711-B459-2338E22DF399 tgt=638683B3-25D9-4D2A-AF2E-4E43362FDBFA failing=src aid=scheduler_c3ZjX3N1bW1hcmlzZXI_ZWVfc2VhcmNoX2NhbA__RMD579a1d02cbad79018_at_1592480820_2426_644D578C-F001-4711-B459-2338E22DF399

 

DispatchManager: This component has the highest number of events, and it had no events before the upgrade.

06-18-2020 12:59:41.537 +0100 WARN DispatchManager - enforceQuotas: username="apasha", search_id="apasha__apasha_ZWVfc2VhcmNoX3BjcmY__search3_1592481563.5052_644D578C-F001-4711-B459-2338E22DF399" - QUEUED reason="The maximum number of concurrent historical searches for this user based on their role quota has been reached.", concurrency_limit="2"

The log above looks like an ad hoc search, and it says QUEUED. What would be the reason? Any help is highly appreciated. Thanks. See the logs below too.

 

ArtifactReplicator:

06-18-2020 12:32:41.201 +0100 WARN ArtifactReplicator - event=artifactReplicationFailed type=ReplicationFiles files="/opt/splunk/var/run/splunk/dispatch/_splunktemps/send/s2s/scheduler_c3ZjX3N1bW1hcmlzZXI_ZWVfc2VhcmNoX2NhbA__RMD527b16720760c2872_at_1592479740_2005_638683B3-25D9-4D2A-AF2E-4E43362FDBFA-644D578C-F001-4711-B459-2338E22DF399.tar" guid=644D578C-F001-4711-B459-2338E22DF399 host=10.164.196.166 s2sport=8999 aid=4229. Connection failed
host = searchcr01, source = /opt/splunk/var/log/splunk/splunkd.log, sourcetype = splunkd

06-18-2020 12:32:41.200 +0100 WARN ArtifactReplicator - Connection failed
host = searchcr01, source = /opt/splunk/var/log/splunk/splunkd.log, sourcetype = splunkd

06-18-2020 12:32:41.200 +0100 WARN ArtifactReplicator - Replication connection to ip=10.164.196.166:8999 timed out

@gcusello - Can you please help me? 🙂

Regards,

Pramodh


gcusello
SplunkTrust

Hi @PramodhKumar ,

I don't think it's possible to solve this problem without accessing the system and analyzing the logs and configurations, so my suggestion is to open a ticket with Splunk Support so they can access the systems and quickly solve the problem.

When opening the ticket, send them the diags of all the SHC members and the Deployer.
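
While gathering material for the ticket, it may also be worth re-checking, from each SHC member, that the replication port named in the "timed out" warning is really reachable. The snippet below is only a minimal sketch of such a check in Python. The function name `port_reachable` and the 3-second default timeout are assumptions made for illustration, not anything provided by Splunk. A successful TCP connect only shows that the port accepts connections; it does not prove that artifact replication itself works.

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for illustration; it only tests basic TCP reachability.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, calling `port_reachable("10.164.196.166", 8999)` on the member that logged the warning would show whether that address and port (taken from the log above) accept a connection from there.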

Ciao.

Giuseppe
