My LM, indexer, SH, and DS all run on the same host, on Splunk Enterprise version 9.4. About 10 messages per second are logged in splunkd.log with the following errors:

ERROR BTree [1001653 IndexerTPoolWorker-3] - 0th child has invalid offset: indexsize=67942584 recordsize=166182200, (Internal)
ERROR BTreeCP [1001653 IndexerTPoolWorker-3] - addUpdate CheckValidException caught: BTree::Exception: Validation failed in checkpoint

I have noticed that btree_index.dat and btree_records.dat in /opt/splunk/data/fishbucket/splunk_private_db are re-created every few seconds. From what I can tell, once they reach a certain point, those files are copied into the corrupt directory and deleted, and then the cycle starts over. I have tried shutting down Splunk and copying snapshot files over, but when I restart Splunk they are overwritten and the loop of files being created and then copied to corrupt begins again.

I tried a repair on the data files with the following command:

splunk cmd btprobe -d /opt/splunk/data/fishbucket/splunk_private_db -r

which returned the following output:

no root in /opt/splunk/data/fishbucket/splunk_private_db/btree_index.dat with non-empty recordFile /opt/splunk/data/fishbucket/splunk_private_db/btree_records.dat
recovered key: 0xd3e9c1eb89bdbf3e | sptr=1207
Exception thrown: BTree::Exception: called debug on btree that isn't open!

It is entirely possible there is some corruption somewhere. We did have a filesystem issue a while back: I had to run fsck, and there were a few files that I removed. As for the data itself, I can't work out where the problem might be. In Splunk search, the _internal index appears to have incomplete data. I can't view licensing, and Data Quality is empty, with no data.

Do I have corrupt data somewhere that is causing problems with my btree index data? How would I go about finding the cause of this problem?
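
In case it helps anyone quantify the errors, here is a minimal Python sketch that tallies the BTree and BTreeCP messages by component and worker thread and pulls out the indexsize/recordsize pairs. The regular expressions are based only on the two lines quoted above; the function name `summarize` is my own, and I have not assumed anything about the timestamp prefix that precedes ERROR in the real file.

```python
import re
from collections import Counter

# Matches the two error shapes quoted in this post. Anything that precedes
# "ERROR" on the line (timestamp, timezone) is deliberately not assumed.
BTREE_ERROR = re.compile(
    r"ERROR\s+(BTree(?:CP)?)\s+\[(\d+)\s+([^\]]+)\]\s+-\s+(.*)"
)
SIZES = re.compile(r"indexsize=(\d+)\s+recordsize=(\d+)")


def summarize(lines):
    """Count BTree/BTreeCP errors per (component, thread) and collect sizes."""
    counts = Counter()
    sizes = []
    for line in lines:
        match = BTREE_ERROR.search(line)
        if not match:
            continue
        component, _pid, thread, message = match.groups()
        counts[(component, thread)] += 1
        size_match = SIZES.search(message)
        if size_match:
            sizes.append((int(size_match.group(1)), int(size_match.group(2))))
    return counts, sizes


# The two lines from this post, used as sample input.
sample = [
    "ERROR BTree [1001653 IndexerTPoolWorker-3] - 0th child has invalid "
    "offset: indexsize=67942584 recordsize=166182200, (Internal)",
    "ERROR BTreeCP [1001653 IndexerTPoolWorker-3] - addUpdate "
    "CheckValidException caught: BTree::Exception: Validation failed in checkpoint",
]

counts, sizes = summarize(sample)
for (component, thread), n in sorted(counts.items()):
    print(component, thread, n)
for index_size, record_size in sizes:
    print("indexsize", index_size, "recordsize", record_size)
```

To run it against the live log, pass an open file handle instead of `sample`, for example `summarize(open(path, errors="replace"))`, where `path` would be wherever splunkd.log lives on your install (I am assuming /opt/splunk/var/log/splunk/splunkd.log here, given the /opt/splunk prefix above).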