Hi @richgalloway,
This is with respect to your solution posted in https://community.splunk.com/t5/Splunk-Search/Searchquery-error/m-p/509508. Since that thread is from 2020 and is marked as resolved, I have created this new thread.
The issue is about an error message observed in Splunk's index=_internal:
Failed to read size=1 event(s) from rawdata in bucket
Rawdata may be corrupt, see search.log. Results may be incomplete!
You shared that if the bucket prefix is "rb_", it is a replicated bucket and thus we should stop the indexer, delete the bucket, then restart the indexer; the cluster master will then create a new replicated bucket.
Firstly, I need your inputs on the "db_" prefix: what does it stand for, and what actions should be taken for it?
Secondly, I also observed buckets with the prefix "hot_v1". What does it stand for, and what actions should be taken for it?
Thirdly, you stated the specific file may be corrupt.
I need your inputs on the below:
1. How do I determine whether the file became corrupt, or whether the cause is something else?
2. If the file is corrupt, how do I find details such as:
2.1 Which forwarder sent the data?
2.2 At what timestamp did the file become corrupt?
Thank you
Since the answer was written, I've learned more about bucket names. The "rb_" prefix means the bucket was a replicate when it was first created. However, it may now be the primary bucket if the original primary was lost (buckets are not renamed in that case).
The "db_" prefix is for primary buckets. Use the fsck command to repair it.
The "hot_" prefix is for hot buckets - those open for writing. Restart the indexer to roll the bucket to warm ("db_*"), then use fsck.
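For reference, the workflow above can be sketched as CLI commands on the indexer. The exact fsck flags can vary by Splunk version, so check `splunk fsck --help` first; the index and bucket names below are placeholders, not values from this thread:

```shell
# Scan all buckets for corruption first (read-only):
$SPLUNK_HOME/bin/splunk fsck scan --all-buckets-all-indexes

# Repair a single "db_" bucket reported as corrupt
# (index and bucket names are placeholders):
$SPLUNK_HOME/bin/splunk fsck repair --one-bucket \
    --index-name=main --bucket-name=db_1588888888_1588800000_42

# For a "hot_" bucket, restart the indexer so the bucket rolls
# to warm ("db_*"), then repair it as above:
$SPLUNK_HOME/bin/splunk restart
```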
As the error message says, the bucket may be corrupt - or maybe there's something else wrong. If the fsck command doesn't fix it then contact Splunk Support for assistance. They may be able to determine the reason for the failure.
Since we don't have access to the bucket file structure, there's no way to look inside to see what the problem is. Splunk Support will have to do that for you. It's unlikely, however, that they can tell you which forwarder sent the data (that's not recorded, by default) or when the corruption happened.
Keep in mind that the raw data in a warm or cold bucket is never written to so it never changes. However, buckets are collections of files and any of the supporting files in a bucket could change enough to prevent an indexer from reading data properly. Likewise, a file system error might keep an indexer from reading a bucket.
https://docs.splunk.com/Documentation/Splunk/9.0.4/Indexer/HowSplunkstoresindexes
From that documentation, I gathered that "db" stands for an originating bucket and "rb" for a replicated bucket.
Hi @richgalloway,
Thank you for sharing your prompt and detailed inputs for all the questions shared in the content.
It would be very helpful if you could also check out the thread below and share your inputs:
https://community.splunk.com/t5/Monitoring-Splunk/How-to-fetch-details-of-corrupted-data/m-p/638721
Thank you