Why am I getting "Missing enough suitable candidates to create a replicated copy" building a multisite indexer cluster staging environment?

SplunkTrust

Hello,

I have built a Splunk testing / staging environment on top of 6 VMs. The Splunk version is 6.2.3 and it is running on CentOS 6.6.
I have 1 Search Head, 1 Utilities Server (Cluster Master / Deployment Server) and 4 Indexers (Multi Site Cluster).

The cluster has come up and is showing all hosts within the management console.

Search factor is set to 2 and replication factor is set to 3.
The only indexes are _audit and _internal. Searchable data is fine 2/2 but replicated data is 2/3.
Bucket status is showing 8 pending fixup tasks. I have rebooted the cluster master and peer nodes (rolling restart).

If I view the bucket statuses I get:

Missing enough suitable candidates to create a replicated copy in order to meet replication policy. Missing={ site2:1 }

Any help or advice would be appreciated.

1 Solution

SplunkTrust

As it's a new build I just cleaned out the indexes and now the issue has gone away:
splunk stop
splunk clean eventdata -index _internal -f
splunk clean eventdata -index _audit -f
splunk start

Engager

We received the same bucket status as you described on most of our pending fixup actions, for both search factor and replication factor.

Search factor:

Missing enough suitable candidates to create searchable copy in order to meet replication policy. Missing={ site3:1 }

Replication factor:

Missing enough suitable candidates to create a replicated copy in order to meet replication policy. Missing={ site3:1 }

In our case, we verified all our configurations were correct per the link posted by dxu and elsewhere in the documentation and then just had to wait it out. The status did not go away until our pending fixup actions got into the ~300 range (down from ~30k originally).
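For anyone comparing their own setup, the multisite settings worth double-checking look roughly like the sketch below. The site names, factor values, and master host are placeholders assumed for illustration, not values from this thread, and the shared secret is left out. As far as I understand it, a peer holds at most one copy of a given bucket, so each site needs at least as many peers as the number of copies the policy asks that site to hold.

```
# server.conf on the cluster master (illustrative values, not from this thread)
[general]
site = site1

[clustering]
mode = master
multisite = true
available_sites = site1,site2
# origin:2 = two copies on the site where the data was indexed
# total:3  = three copies across the whole cluster
site_replication_factor = origin:2,total:3
site_search_factor = origin:1,total:2

# server.conf on each peer (set site to that peer's own site)
[general]
site = site2

[clustering]
mode = slave
master_uri = https://<master-host>:8089
```

With two sites of two peers each, this example policy can be met from either origin site. A site with a single peer could never hold two copies.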

So in short, it resolved itself. This was on Splunk 7.3.

Path Finder

I got a similar issue with 7.3.3 when migrating from single-site to multisite. I found that reducing replication_factor and search_factor helped the process finish faster.
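As I understand it, buckets created before a single-site to multisite migration keep following the single-site replication_factor and search_factor on the master by default, which would explain why lowering them shrinks the fixup backlog. A sketch with assumed values (choose your own, and keep in mind that lower values mean fewer copies of those legacy buckets):

```
# server.conf on the cluster master (hypothetical values)
[clustering]
mode = master
multisite = true
# By default these still apply to legacy single-site buckets after migration
replication_factor = 2
search_factor = 1
```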

Contributor

Is there any solution for this issue in production for a single-site clustered environment? Kindly advise.

Explorer

This is a horrible idea. Deleting data is not the proper way to resolve SF/RF issues.

Contributor

I downvoted this post because it causes data loss.

Communicator

I downvoted this post because I can't remove all data to fix my data issue.

Contributor

I downvoted this post because removing data to fix a data issue is not an answer.

SplunkTrust

I downvoted this post because this isn't a realistic answer and should very clearly state that all data in that index will be destroyed.

Explorer

I downvoted this post because wiping all data doesn't solve the problem

Contributor

I downvoted this post because I can't remove all data to fix my data issue.

Motivator

wiping all data isn't really a realistic answer

SplunkTrust

The trigger condition is showing:
"Removing peer "
