Presuming you already know what indexing buckets are...
A Splunk hot bucket is changed into an invalid_hot bucket when Splunk detects that the metadata files (Sources.data/Hosts.data/SourceTypes.data) are corrupt or incorrect. Two kinds of incorrect data are detected: the time ranges may be wrong, or the event counts may be wrong. We believe that the time ranges are usually at fault.
An invalid hot bucket is mostly ignored by the index from this point on. Since we don't trust it, we don't want to put more data in it, and we do not search it. Invalid hot buckets are not currently (4.1.x) automatically recovered or automatically managed in any way.
Invalid hots do not count as hot or warm for index management purposes (max number of allowed hot buckets, max number of allowed warm buckets). Thus, they will not negatively affect the flow of data through the system, but at the time of this writing (4.1.3) they can use more disk storage than expected, because the normal data will be stored in addition to the invalid hot data.
In some cases this is inconsequential. Historically, some versions of Splunk decided a bucket was invalid too early, while it was still an empty directory. While a nuisance, that scenario does no harm, and you can safely delete such an empty invalid hot. (These were generated by versions around 4.0.3; if you are running one, please upgrade.)
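To spot the harmless empty ones, here is a minimal Python sketch that lists invalid hot directories with nothing in them. It assumes the directories are named with an `invalid_hot_` prefix (as in the `invalid_hot_5` example in this post) and that you pass in the index's db directory yourself. The function name is just illustrative, and it only reports; it deletes nothing.

```python
import os

def find_empty_invalid_hots(index_db_dir, prefix="invalid_hot_"):
    """Return sorted paths of invalid hot directories that contain no entries.

    The "invalid_hot_" prefix is an assumption based on the example
    bucket name used in this post.
    """
    empty = []
    for name in sorted(os.listdir(index_db_dir)):
        path = os.path.join(index_db_dir, name)
        # Only report directories that match the prefix and are truly empty.
        if name.startswith(prefix) and os.path.isdir(path) and not os.listdir(path):
            empty.append(path)
    return empty
```

Review the returned list by hand before removing anything.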
In other cases, real data had already arrived in the hot bucket before it was determined to be problematic.
The corrective action for an invalid hot is to rebuild its metadata files with the recover-metadata command and then rename the bucket.
The recover-metadata command is destructive. It will clobber the existing .data files in a bucket. I recommend making a duplicate of these files before running recover-metadata even if their only use may be for forensics purposes.
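For the backup step, a minimal Python sketch along these lines would do. It assumes the metadata files all end in `.data` and sit at the top level of the bucket directory; the function name and the backup location are hypothetical.

```python
import glob
import os
import shutil

def backup_data_files(bucket_dir, backup_dir):
    """Copy every top-level *.data file from bucket_dir into backup_dir.

    Returns the names of the files copied, in sorted order.
    """
    os.makedirs(backup_dir, exist_ok=True)
    copied = []
    for src in sorted(glob.glob(os.path.join(bucket_dir, "*.data"))):
        if os.path.isfile(src):
            # copy2 keeps timestamps, which may help later forensics.
            shutil.copy2(src, backup_dir)
            copied.append(os.path.basename(src))
    return copied
```

Check that the returned list names the files you expect (for example Hosts.data, Sources.data, SourceTypes.data) before going any further.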
To run recover-metadata, run `splunk cmd recover-metadata path/to/your/invalid_hot_5`. Hopefully, it tells you that it worked.
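If you have several buckets to fix, the same command can be driven from Python. This is only a sketch: it assumes `splunk` is on your PATH, and the wrapper name and the `runner` parameter are made up for illustration. It hands back the exit status without interpreting it, since this post does not say what status the command returns on success, so read the command's own output too.

```python
import subprocess

def run_recover_metadata(bucket_dir, splunk_bin="splunk", runner=subprocess.run):
    """Run `splunk cmd recover-metadata <bucket_dir>` and return its exit status.

    `runner` defaults to subprocess.run and is a parameter only so the
    call can be replaced with a stand-in when experimenting.
    """
    cmd = [splunk_bin, "cmd", "recover-metadata", bucket_dir]
    result = runner(cmd)
    # What a given status means is not covered here; check the output as well.
    return result.returncode
```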
If recover-metadata is successful, rename the bucket as it would normally be named (link to script forthcoming) and all should be well.
For anyone interested, I wrote a little Python script that will attempt to restore the proper case in your recovered metadata files by using your index-level metadata files. Splunk stores all of your metadata values in lower case in the index and only preserves the case in your .data files, which get overwritten when you run recover-metadata; this script tries to fix this. (BTW, I haven't tried this in 4.x, but it should work fine.) Here is the script: http://pastebin.ca/1481049
Mind if I pull this and stash it on the splunk.com wiki somewhere? I'm unclear why the case matters, since I thought we smashed case for search purposes. Is it a display issue?