How to resolve bucket replication errors such as "Replication failed due to open failure" occurring in search peer?

Hi all,

We're running an indexer cluster on 6.5.1. We found the following errors in splunkd.log on one of the search peers:

12-06-2016 15:32:53.654 +0800 ERROR BucketReplicator - Replication failed due to open failure: file=/opt/splunk/var/lib/splunk/_internaldb/db/db_1480966426_1480764958_87_BF4B1947-4FB6-4464-BD62-299457B51B72/1480941784-1480764958-4821280532600088189.tsidx error='No such file or directory'

12-06-2016 15:34:01.061 +0800 ERROR BucketReplicator - Replication failed due to open failure: file=/mnt/security/db_1481009551_1480929735_9_BF4B1947-4FB6-4464-BD62-299457B51B72/1481007010-1481003314-4825555822462713745.tsidx error='No such file or directory'

It seems some tsidx files are missing, although other tsidx files are still present in the same buckets. I have no idea what happened. Could anyone please help? Thanks.

Also, a cold bucket folder for a heavy index looks like the following:

indexer1:
db_1479686070_1479451778_0_BF4B1947-4FB6-4464-BD62-299457B51B72
db_1479873491_1479686071_1_BF4B1947-4FB6-4464-BD62-299457B51B72
rb_1478498103_1478252227_4_7DE7B2FF-7653-48F6-8C1B-4F611554920C
rb_1478568321_1478498104_5_7DE7B2FF-7653-48F6-8C1B-4F611554920C

indexer2:
db_1478498103_1478252227_4_7DE7B2FF-7653-48F6-8C1B-4F611554920C
db_1478568321_1478498104_5_7DE7B2FF-7653-48F6-8C1B-4F611554920C
rb_1479686070_1479451778_0_BF4B1947-4FB6-4464-BD62-299457B51B72
rb_1479873491_1479686071_1_BF4B1947-4FB6-4464-BD62-299457B51B72

It seems buckets are renamed to "rb_*" when they are replicated to a peer. Is that correct?
Sorry for the newbie questions.

Thanks a lot.
Regards,
/ST Wong

Re: How to resolve bucket replication errors such as "Replication failed due to open failure" occurring in search peer?

Splunk Employee

The first (quick) answer is yes: when a bucket is replicated to another peer, the replicated copy's directory name begins with "rb_", while the copy on the originating peer keeps the "db_" prefix.
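To make the naming concrete, here is a minimal Python sketch that splits a bucket directory name into its parts. It assumes the usual clustered layout of prefix, newest event time, oldest event time, local ID, and origin peer GUID; the function name parse_bucket_name is just an illustration, not part of any Splunk tooling.

```python
import re
from datetime import datetime, timezone

# Assumed layout: <db|rb>_<newest_epoch>_<oldest_epoch>_<local_id>_<origin_guid>
BUCKET_RE = re.compile(
    r"^(?P<kind>db|rb)_(?P<newest>\d+)_(?P<oldest>\d+)"
    r"_(?P<local_id>\d+)_(?P<guid>[0-9A-Fa-f-]{36})$"
)

def parse_bucket_name(name):
    """Return the fields of a bucket directory name, or None if it does not match."""
    m = BUCKET_RE.match(name)
    if m is None:
        return None
    return {
        # "rb" marks a replicated copy, "db" the originating peer's copy
        "replicated": m.group("kind") == "rb",
        "newest": datetime.fromtimestamp(int(m.group("newest")), tz=timezone.utc),
        "oldest": datetime.fromtimestamp(int(m.group("oldest")), tz=timezone.utc),
        "local_id": int(m.group("local_id")),
        "origin_guid": m.group("guid"),
    }

if __name__ == "__main__":
    for d in (
        "db_1479686070_1479451778_0_BF4B1947-4FB6-4464-BD62-299457B51B72",
        "rb_1479686070_1479451778_0_BF4B1947-4FB6-4464-BD62-299457B51B72",
    ):
        print(d[:2], parse_bucket_name(d))
```

Run against the listings in the question, this shows that the db_ and rb_ directories with the same ID and GUID are two copies of the same bucket, and that the GUID identifies the peer where the bucket originated.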

As for the missing tsidx files, it may be possible to rebuild the bucket. From the CLI, you'd use something like splunk rebuild db_1479686070_1479451778_0_BF4B1947-4FB6-4464-BD62-299457B51B72, pointing it at the affected bucket directory. This regenerates the tsidx files (and the *.data metadata files) from scratch, using the raw data journal in the bucket.
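If there are many of these errors, it can help to first list which bucket directories are affected. The following is a rough Python sketch, not an official tool: it assumes the error lines look exactly like the two quoted in the question and that the file paths contain no spaces. The function name buckets_with_open_failures is made up for this example.

```python
import re
from pathlib import PurePosixPath

# Pattern modelled on the two log lines quoted in the question (an assumption).
ERROR_RE = re.compile(
    r"ERROR BucketReplicator - Replication failed due to open failure: "
    r"file=(?P<file>\S+) error='(?P<error>[^']*)'"
)

def buckets_with_open_failures(log_lines):
    """Map each bucket directory to the sorted file names that failed to open."""
    failures = {}
    for line in log_lines:
        m = ERROR_RE.search(line)
        if m is None:
            continue  # not a BucketReplicator open failure
        path = PurePosixPath(m.group("file"))
        # The parent directory of the missing file is the bucket directory
        failures.setdefault(str(path.parent), set()).add(path.name)
    return {bucket: sorted(names) for bucket, names in failures.items()}
```

You could feed it the lines of splunkd.log (for example, by opening the file and passing the file object) and then review each bucket directory it reports before deciding whether to rebuild it.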
