Presuming you already know what indexing buckets are...
A Splunk hot bucket is changed into an invalid_hot bucket when Splunk detects that the metadata files (Sources.data/Hosts.data/SourceTypes.data) are corrupt or incorrect. Two kinds of incorrect data are detected: the time ranges may be incorrect, or the event counts may be incorrect. We believe that the time ranges are usually at fault.
An invalid hot bucket is mostly ignored by the index from this point on. Since we don't trust it, we don't want to put more data in it, and we do not search it. Invalid hot buckets are not currently (4.1.x) automatically recovered or automatically managed in any way.
Invalid hots do not count as hot or warm for index management purposes (max number of allowed hot buckets, max number of allowed warm buckets). Thus, they will not negatively affect the flow of data through the system, but at the time of this writing (4.1.3) they can consume more disk space than expected, because the normal data will be stored in addition to the invalid hot data.
In some cases this is inconsequential. There have been versions historically where Splunk decided a bucket was invalid too early, when it was still an empty directory. While a nuisance, that scenario does no harm. You can safely delete such an empty invalid hot. (These were generated by versions around 4.0.3. If you are running such a version, please upgrade.)
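To spot the harmless empty case, something like the following sketch may help. It only lists candidates and deletes nothing. It assumes the invalid hots sit directly under the index's db directory and are named invalid_hot_<id> (as in the example path further down); the function name is made up for illustration.

```shell
#!/bin/sh
# List invalid hot bucket directories that contain nothing at all.
# Assumptions: buckets live directly under the given db directory and
# invalid hots are named invalid_hot_<id>. Nothing is deleted here.
list_empty_invalid_hots() {
    db_dir=$1
    # -empty matches only directories with no entries in them
    find "$db_dir" -maxdepth 1 -type d -name 'invalid_hot_*' -empty
}
```

Call it as `list_empty_invalid_hots /path/to/your/index/db`, look over the output, and remove the listed directories by hand once you are satisfied they really are empty.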
In other cases, real data had already arrived in the hot bucket before it was determined to be problematic.
The corrective action for an invalid hot is to:
1. Attempt to rebuild the metadata from the rawdata information (recover-metadata).
2. If successful, rename the bucket as a warm bucket so it rejoins the Splunk index proper.
The recover-metadata command is destructive. It will clobber the existing .data files in a bucket. I recommend making a duplicate of these files before running recover-metadata, even if their only use may be for forensic purposes.
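One way to take that copy is sketched below. It assumes the .data files sit at the top level of the bucket directory; the function name and the backup location are hypothetical, so pick a backup directory outside the bucket.

```shell
#!/bin/sh
# Copy a bucket's top-level .data files to a backup directory before
# running recover-metadata. The function name and the backup path are
# illustrative, not anything Splunk ships.
backup_bucket_metadata() {
    bucket=$1
    backup=$2
    mkdir -p "$backup" || return 1
    for f in "$bucket"/*.data; do
        # If the glob matched nothing, the pattern stays literal; skip it.
        [ -e "$f" ] || continue
        # -p keeps the original timestamps, which may matter for forensics.
        cp -p "$f" "$backup"/ || return 1
    done
}
```

For example: `backup_bucket_metadata path/to/your/invalid_hot_5 /some/safe/place/invalid_hot_5_metadata`.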
To run recover-metadata, run `splunk cmd recover-metadata path/to/your/invalid_hot_5`. Hopefully, it tells you that it worked.
If recover-metadata is successful, rename the bucket as it would normally be named (link to script forthcoming) and all should be well.
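Until that script is available, here is a rough sketch of the naming step only. It assumes the usual warm bucket naming of db_<latest time>_<earliest time>_<id>, with both times as epoch seconds. It does not read the times out of the rebuilt metadata; you have to supply them yourself, and the function name is made up. Treat it as an illustration, not as the forthcoming script.

```shell
#!/bin/sh
# Build a warm bucket directory name from values supplied by the caller.
# Assumption: warm buckets are named db_<latest>_<earliest>_<id>, with
# times in epoch seconds. This only prints a name; it renames nothing.
warm_bucket_name() {
    latest=$1
    earliest=$2
    id=$3
    for v in "$latest" "$earliest" "$id"; do
        case $v in
            ''|*[!0-9]*)
                echo "not a non-negative integer: '$v'" >&2
                return 1
                ;;
        esac
    done
    if [ "$latest" -lt "$earliest" ]; then
        echo "latest time is before earliest time" >&2
        return 1
    fi
    printf 'db_%s_%s_%s\n' "$latest" "$earliest" "$id"
}
```

With a latest time of 1280000000, an earliest time of 1270000000 and bucket id 5, this prints db_1280000000_1270000000_5. Check that no bucket with that id already exists in the index before doing the actual rename.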