One of my indexers crashed in a clustered environment and left a corrupted bucket. Searches that hit that bucket return an error like this:
[indexer1] idx=os Could not read event: cd=21145:261500. Results may be incomplete ! (logging only the first such error; enable DEBUG to see the rest)
Is there a command to remove or fix the corrupted bucket? I can't shut down the indexer to run fsck right now.
From the message, the bucket number is 21145 (cd=21145:261500). Run the search below to locate the actual bucket:
| dbinspect index=os
| search bucketId = *21145*
| table bucketId, guId, splunk_server, index, state
Once you have the bucketId and the guId of the peer, run the REST call below against the cluster master to remove the bucket from that peer:
splunk _internal call /services/cluster/master/buckets/<bucketId>/remove_from_peer -method POST -post:peer <guId>
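If this comes up often, the lookup step can be scripted. The following is only a minimal sketch in Python, assuming the error text always carries a cd=<bucket>:<offset> token as in the message above; the helper names bucket_number and dbinspect_search are made up for illustration and are not part of Splunk.

```python
import re


def bucket_number(error_message: str) -> str:
    """Return the bucket number from a 'cd=<bucket>:<offset>' token."""
    # Assumption: the token looks like cd=21145:261500 (digits, colon, digits).
    match = re.search(r"\bcd[=:](\d+):(\d+)", error_message)
    if match is None:
        raise ValueError("no cd=<bucket>:<offset> token found")
    return match.group(1)


def dbinspect_search(index: str, bucket: str) -> str:
    """Build the dbinspect search string used above for one index and bucket."""
    return (
        f"| dbinspect index={index} "
        f"| search bucketId=*{bucket}* "
        f"| table bucketId, guId, splunk_server, index, state"
    )


message = (
    "[indexer1] idx=os Could not read event: cd=21145:261500. "
    "Results may be incomplete !"
)
print(dbinspect_search("os", bucket_number(message)))
```

The printed search can be pasted into the search bar as-is. The wildcard match on bucketId may return more than one row, so check the index and splunk_server columns before removing anything.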
How about a non-clustered bucket? Can I just delete it from the OS?