Hi,
I'm getting the below error, and the hot buckets are not replicated across the indexers in the cluster.
08-23-2019 04:20:05.197 +0000 WARN BucketReplicator - Failed to replicate Streaming bucket bid=main~337~BB922979-04B2-49E0-AE94-B75588520776 to guid=EFEA22B6-1EAC-4505-8B01-DD2A57666CD7 host=splunk-indexer-b.xxx.svc.cluster.local s2sport=9887. Connection failed
When I telnet from indexer-a to indexer-b, the connection is closed immediately. However, if I telnet from localhost, the connection is established:
root@splunk-indexer-a:~# telnet splunk-indexer-b.ns.svc.cluster.local 9887
Trying 172.20.208.244...
Connected to splunk-indexer-b.ns.svc.cluster.local.
Escape character is '^]'.
Connection closed by foreign host.
Thanks.
You have connectivity problems between your indexers. Check your firewalls.
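In case it helps, a quick way to verify this is to check the replication port on both sides. This assumes ss and nc are available on the indexer hosts (they may not be in a minimal container image):

# on splunk-indexer-b: confirm splunkd is listening on the replication port
ss -tlnp | grep 9887
# from splunk-indexer-a: confirm the port is reachable through any firewall or network policy
nc -zv splunk-indexer-b.ns.svc.cluster.local 9887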
Hi Richgalloway,
Yup, I've fixed the connectivity issue, and now I'm having a different issue.
The search factor (2) and the replication factor (2) are not always met. I have 3 indexers in my cluster and all the connections look good. One thing I noticed is that there are always some fixup tasks running on the bucket status UI, with the fixup reason showing as "streaming failure - src=B621E78A-369C-42CC-B604-000174F54036 tgt=BB922979-04B2-49E0-AE94-B75588520776 failing=src"
After some time the fixup tasks complete successfully, but they appear again for new buckets, and so on. Because of this behavior I always see a "Data Durability" warning on the master. I recently activated SmartStore in our cluster and I'm not sure whether that's causing this issue.
Thanks.
Please share your server.conf and inputs.conf.
Hi,
Please find the required config files below. Also, I don't have a customized inputs.conf and am using only the defaults.
Adding a few more points: I have a 3-indexer cluster, and today I reduced the replication/search factor to 2, which keeps data durability green for longer than before (3 SF / 3 RF). One of the peers always lags behind the other two in catching up with hot buckets.
Is there a way to set a wider time window for syncing the hot buckets?
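# server.conf on the cluster master (inferred from mode = master)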
[clustering]
cluster_label = idxc_label
mode = master
pass4SymmKey =
search_factor = 3
replication_factor = 3
rebalance_threshold = 0.9
percent_peers_to_restart = 100
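# server.conf on an indexer peer (inferred from mode = slave)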
[clustering]
master_uri = https://splunk-master.splunk.svc.cluster.local:8089
mode = slave
pass4SymmKey =
register_replication_address = splunk-indexer-1.splunk.svc.cluster.local
max_replication_errors = 50
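# indexes.conf (the settings from here down, including the SmartStore volume, are indexes.conf settings)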
maxGlobalDataSizeMB = 50000
[default]
remotePath = volume:remote_store/$_index_name
repFactor = 0
[main]
repFactor = auto
[_introspection]
repFactor = 0
[_audit]
repFactor = 0
[_internal]
repFactor = 0
[_telemetry]
repFactor = 0
[volume:remote_store]
storageType = remote
path = s3://splunk-smart-store-bucket
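Regarding the question above about a wider window for syncing hot buckets: below is a minimal sketch of the replication timeout settings that can be raised in the [clustering] stanza of server.conf on the peers. The setting names come from the server.conf spec; the values are purely illustrative, not tested recommendations for this cluster.

[clustering]
# low-level timeout for establishing a connection between replicating peers
rep_cxn_timeout = 120
# low-level timeouts for sending replication data and receiving acknowledgements
rep_send_timeout = 60
rep_rcv_timeout = 60
# upper bounds on the send/receive timeouts before the replication attempt is abandoned
rep_max_send_timeout = 600
rep_max_rcv_timeout = 600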
Hi, I had the same problem. I changed two things in the server.conf file of all indexers:
from,
master_uri = https://splunk-master.splunk.svc.cluster.local:8089
mode = slave
to,
manager_uri = https://splunk-master.splunk.svc.cluster.local:8089
mode = peer
and restarted Splunk. These changes worked for me.
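For reference, the resulting [clustering] stanza on each indexer would look roughly like this (the pass4SymmKey value is a placeholder; the manager hostname is the one used elsewhere in this thread):

[clustering]
mode = peer
manager_uri = https://splunk-master.splunk.svc.cluster.local:8089
pass4SymmKey = <your indexer cluster secret>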
@sankareds Could you please share some details on how exactly you fixed network connectivity issues to solve WARN BucketReplicator - Failed to replicate Streaming bucket?
We have exactly the same issue.
Your help will be much appreciated!
If I remember correctly, I added the below Ansible task to register the replication address. This way the indexers can trust each other's replication requests.
---
- include_tasks: ../../../roles/splunk_common/tasks/wait_for_splunk_instance.yml
  vars:
    splunk_instance_address: "{{ cluster_master_host }}"

- name: Set current node as indexer cluster peer
  command: "{{ splunk.exec }} edit cluster-config -register_replication_address $HOSTNAME.splunk.svc.cluster.local -mode slave -master_uri '{{ cert_prefix }}://{{ cluster_master_host }}:{{ splunk.svc_port }}' -replication_port {{ splunk.idxc.replication_port }} -secret '{{ splunk.idxc.secret }}' -auth '{{ splunk.admin_user }}:{{ splunk.password }}'"
  become: yes
  become_user: "{{ splunk.user }}"
  register: task_result
  changed_when: task_result.rc == 0
  until: task_result.rc == 0
  retries: "{{ retry_num }}"
  delay: 3
  ignore_errors: yes
  notify:
    - Restart the splunkd service
  no_log: "{{ hide_password }}"
As per the Splunk documentation:
register_replication_address = <IP address or fully qualified machine/domain name>
* Only valid for 'mode=peer'.
* This is the address on which a peer is available for accepting replication data. This is useful in the cases where a peer host machine has multiple interfaces and only one of them can be reached by another splunkd instance.
If you are not using splunk-ansible, you could set this value directly in the server.conf.
https://docs.splunk.com/Documentation/Splunk/8.1.1/Admin/Serverconf
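For example, a minimal sketch of the peer-side server.conf with the replication address set explicitly (the hostnames, port, and secret are placeholders taken from the configs and log messages earlier in this thread):

# replication listener (port matches the s2sport in the warning above)
[replication_port://9887]

[clustering]
mode = slave
master_uri = https://splunk-master.splunk.svc.cluster.local:8089
pass4SymmKey = <your indexer cluster secret>
register_replication_address = splunk-indexer-1.splunk.svc.cluster.local

A Splunk restart on the peer is needed for the change to take effect.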