Recently, I observed error messages on my search head like "Unable to distribute to peer named XXX at URI https://xx:8089 because replication was unsuccessful. replicationStatus Failed failure info: Dispatch Command: Search Bundle throttling is occuring because the limit for number of bundles with pending lookups for indexing has been exceeded. This could be the result of large lookup files updating faster than Splunk software can index them. Throttling ends when this instance has caught up with indexing of lookups"
On investigating, I found that many of the lookups in the bundle were over 100 MB, with some exceeding 500 MB. I identified the large lookups and created a replicationBlacklist for them, which I plan to implement in distsearch.conf on my search head.
My question is: after implementing the above change, is it safe to delete all the .bundle files from the $SPLUNK_HOME/var/run directory and then restart splunkd? Some bundles are almost a year old. What would the impact be, is there anything I should take care of before doing this, or is there an alternative?
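For anyone wanting to repeat the "identify the large lookups" step, here is a minimal sketch in Python. It is not a Splunk tool. The function name `find_large_lookups`, the 100 MB default threshold, and the `/opt/splunk` fallback path are assumptions made for illustration, and it only looks at files sitting directly inside directories named `lookups`.

```python
import os


def find_large_lookups(etc_dir, min_bytes=100 * 1024 * 1024):
    """Return (size_bytes, path) tuples for files found directly inside any
    directory named 'lookups' under etc_dir whose size is at least min_bytes,
    sorted largest first."""
    hits = []
    for root, _dirs, files in os.walk(etc_dir):
        # Only consider files that live directly in a 'lookups' directory.
        if os.path.basename(root) != "lookups":
            continue
        for name in files:
            path = os.path.join(root, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
            if size >= min_bytes:
                hits.append((size, path))
    return sorted(hits, reverse=True)


if __name__ == "__main__":
    # Assumption: SPLUNK_HOME is set; /opt/splunk is only a fallback guess.
    splunk_home = os.environ.get("SPLUNK_HOME", "/opt/splunk")
    for size, path in find_large_lookups(os.path.join(splunk_home, "etc")):
        print(f"{size / (1024 * 1024):8.1f} MB  {path}")
```

The script only reports candidates. Which of them are safe to keep out of the bundle still depends on whether searches running on the indexers need those lookups.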
I believe Splunk keeps the last 5 bundles, so you are safe to remove any beyond those 5. You can delete all the bundles, but I don't recommend it: you will get a lot of inconsistency errors. Once you have updated the blacklist, verify that the lookups are actually blacklisted by checking the bundle size in var/run. If the bundle size has dropped, wait up to 10 minutes and you should see a new bundle pushed to the search peers (indexers).
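The check described above can be sketched as a small Python helper that lists the .bundle files in var/run, newest first, and reports which ones fall outside the newest five. The figure of five comes from the belief stated in this reply, not from confirmed documentation. The function name `bundles_beyond_newest` and the `/opt/splunk` fallback are hypothetical. The script deliberately deletes nothing.

```python
import os


def bundles_beyond_newest(run_dir, keep=5):
    """Return paths of .bundle files in run_dir, excluding the `keep` most
    recently modified ones. The result is ordered newest to oldest."""
    bundles = [
        os.path.join(run_dir, name)
        for name in os.listdir(run_dir)
        if name.endswith(".bundle")
    ]
    bundles = [p for p in bundles if os.path.isfile(p)]
    # Newest first, so slicing past `keep` leaves only the older bundles.
    bundles.sort(key=os.path.getmtime, reverse=True)
    return bundles[keep:]


if __name__ == "__main__":
    # Assumption: SPLUNK_HOME is set; /opt/splunk is only a fallback guess.
    splunk_home = os.environ.get("SPLUNK_HOME", "/opt/splunk")
    run_dir = os.path.join(splunk_home, "var", "run")
    if os.path.isdir(run_dir):
        for path in bundles_beyond_newest(run_dir):
            size_mb = os.path.getsize(path) / (1024 * 1024)
            print(f"{size_mb:8.1f} MB  {path}")
```

Running it before and after the blacklist change also shows the size of each bundle, so it is easy to see whether the newest bundle has shrunk.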
Thanks for the guidance. Since there is a chance of inconsistency errors, if I leave the previous search bundles in place, blacklist the lookups, and restart the service, will Splunk create a new knowledge bundle, or do I have to get rid of the previous one?
Also, do you think my blacklist will work the way it is set up?
Since it'll be done on a production environment, I don't want to cause any trouble 🙂