I have 5 indexers, and one crashed with its data partition damaged. As I understand it, forwarder data is sent to these indexers randomly (is this right?). So, is it possible to restore the data on the bad indexer? Does Splunk have a mechanism similar to RAID 5? Unfortunately we don't have a backup of the index data and don't even know which part of the data is missing. Please help me, thanks a lot.
If you have a Splunk cluster, you should be able to simply turn off the faulty indexer, and the cluster master should ensure continued operation without any loss of data. In this sense a Splunk cluster can be viewed as a form of RAID mirroring at the host level, where the same data is stored in multiple locations (i.e. on multiple Splunk servers).
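For what it's worth, enabling index replication is mostly a matter of server.conf settings. A rough sketch (the host name, symmetric key, and factor values below are example values, not anything from your environment):

```ini
# server.conf on the cluster master
[clustering]
mode = master
replication_factor = 3   ; each bucket is stored on 3 peer indexers
search_factor = 2        ; 2 of those copies are kept searchable

# server.conf on each indexer (peer)
[clustering]
mode = slave
master_uri = https://clustermaster.example.com:8089
pass4SymmKey = yoursecretkey
```

With a replication factor of 3, the cluster can lose an indexer without losing data, which is the mirroring-like behaviour mentioned above.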
In general, if you don't have a Splunk cluster set up and you can't repair the file system on the crashed host, you'll need to copy the undamaged buckets to another indexer, and possibly accept the data loss for the buckets that are damaged. Index buckets are largely unaware of where they belong, so the main task is to ensure that there are no bucket collisions. Buckets have a naming structure like
db_<timestamp1>_<timestamp2>_<serial>, where the timestamps indicate the newest/oldest event within the bucket, and the serial number is just that.
However, when moving a bucket from one index to another, possibly on a different indexer, you need to ensure that the serial number is unique within that index. If a bucket with the same serial number already exists at the destination, you need to rename one of the bucket directories (changing only the serial number part). If you have configured security features such as data signing or event hashing, you will probably need to take that into account; unfortunately I can't guide you there, so I suggest you turn to support, or hope that someone more knowledgeable can provide the answer. The same probably applies if you have enabled accelerated searches.
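If you end up moving buckets by hand, the collision check can be scripted. A minimal sketch, assuming the salvaged buckets are plain db_* directories (the paths in the comment are hypothetical; adjust to your own index locations, and work on copies first):

```shell
#!/bin/sh
# Copy salvaged buckets into an existing index directory,
# bumping the trailing serial number on collision.
copy_buckets() {
    src=$1; dst=$2
    for bucket in "$src"/db_*; do
        [ -d "$bucket" ] || continue
        name=$(basename "$bucket")
        serial=${name##*_}            # trailing serial number
        prefix=${name%_*}             # db_<timestamp1>_<timestamp2>
        # increase the serial until it is unique within the destination index
        while ls -d "$dst"/db_*_"$serial" >/dev/null 2>&1; do
            serial=$((serial + 1))
        done
        cp -r "$bucket" "$dst/${prefix}_${serial}"
    done
}

# e.g. copy_buckets /mnt/recovered/defaultdb/db \
#                   /opt/splunk/var/lib/splunk/defaultdb/db
```

Note that renumbering changes the directory name only; if signing or hashing is in play, this simple approach may not be enough, as mentioned above.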
As for forwarders: with automatic load balancing they send data to all indexers they are configured for, switching between them one at a time at 30-second intervals by default. So if you turn off your broken indexer (if it is still running and accepting incoming traffic), the forwarders will notice this and send their data to the remaining indexers.
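For reference, that rotation interval comes from the forwarder's load-balancing settings in outputs.conf. A sketch (the host names and ports are example values):

```ini
[tcpout]
defaultGroup = my_indexers

[tcpout:my_indexers]
server = indexer1.example.com:9997, indexer2.example.com:9997, indexer3.example.com:9997
; switch to the next indexer in the list every 30 seconds (the default)
autoLBFrequency = 30
```

If one indexer in the list becomes unreachable, the forwarder skips it and keeps cycling through the healthy ones.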
Hope this helps,
For future operation, make sure each of your indexers uses a more robust disk configuration such as RAID 1+0. That way a single crashed disk won't cause an outage; you just need to replace it soon after the failure so that multiple failures don't cause data loss.
Beyond that, Splunk clustering, as Kristian recommends, is a great way of improving availability on top of hardware RAID.