We had a problem today where our filesystem filled up on indexers, caused by many bundles appearing suddenly. I'm not overly familiar with this functionality. How/when do bundles get sent to the indexers? Scheduled? Every search? I see many from the same search-heads, so how does it manage them? Inquiring minds want to know...
So, the search-head sends down the delta for every search? I saw continuous large bundles (not just deltas). Does Splunk manage these in any way, e.g. clean up old files?
In my understanding, whether a knowledge bundle is sent, full or delta, depends on what search is being executed, which knowledge objects it uses, and whether any of those objects have been updated since the last bundle replication. If you're seeing large bundle sizes, I would bet large lookups are the major contributor. You can copy and untar the latest bundle to see which big files are driving the size. Look at this for details on filters you can apply to limit the bundle size.
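Since bundles are ordinary tar archives, one way to find the culprits is to list their members sorted by size. A rough sketch (the demo archive below is a stand-in so the commands are runnable anywhere; on a real indexer you would point `$bundle` at the newest file under `$SPLUNK_HOME/var/run/searchpeers` instead):

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/apps/search/lookups" "$workdir/apps/search/local"
# A large lookup file is often the culprit behind oversized bundles
head -c 1048576 /dev/zero > "$workdir/apps/search/lookups/big_lookup.csv"
printf '[default]\n' > "$workdir/apps/search/local/props.conf"

# Build a stand-in bundle; real bundles are tar archives too
bundle="$workdir/demo.bundle"
tar -cf "$bundle" -C "$workdir" apps

# On a real indexer, something like:
#   bundle=$(ls -t "$SPLUNK_HOME"/var/run/searchpeers/*.bundle | head -1)
# List members sorted by size (column 3 of GNU tar -tv output), largest first
tar -tvf "$bundle" | sort -k3 -rn | head -5
```

The large lookup file shows up at the top of the listing, which tells you exactly what to blacklist.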
The searchpeers directory retains up to five replicated bundles from each search head that distributes searches to the indexer. If you delete them, they will be recreated by the next search that needs that set of configurations.
Thanks. I'd still like to know how they get there. Do the search-heads send them down every time a search is run? Are they scheduled? Sorry for the level of detail, but I'm going to be asked these questions as a result of the outage.
Both the search head and the search peers maintain a checksum of the configuration on the search head and of the knowledge bundle previously sent to the peers (indexers). If the checksum is out of date, the search head sends an updated bundle, whole or delta, to the search peer running the search. IMO, the check happens when a search peer is added or when a search is distributed to a peer.
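Conceptually this works like a content-hash comparison. A minimal sketch of the idea (not Splunk's actual implementation; all names and commands here are illustrative):

```shell
# Sketch: hash the knowledge-object tree and re-ship a bundle only when
# the hash changes. This is a conceptual model, not Splunk internals.
confdir=$(mktemp -d)
printf 'search = index=main\n' > "$confdir/savedsearches.conf"

checksum() {
  # Order-stable hash over all files in the config tree
  find "$1" -type f | sort | xargs cat | cksum | cut -d' ' -f1
}

last_sent=$(checksum "$confdir")   # checksum recorded at last replication
printf '[newstanza]\n' >> "$confdir/savedsearches.conf"   # a config change
current=$(checksum "$confdir")

if [ "$current" != "$last_sent" ]; then
  echo "bundle out of date: would replicate"
fi
```

Any edit to a knowledge object changes the hash, which is why a single saved-search change can trigger a new (delta) bundle push.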
If your file system is filling up with bundles, you are likely seeing an extremely large $SPLUNK_HOME/var/run/searchpeers directory. A search head replicates and distributes its knowledge objects to its search peers in the bundles you see in var/run/searchpeers.
Knowledge objects include saved searches, event types, and other entities used in searching across indexes. The search head needs to distribute this material to its search peers so that they can properly execute queries on its behalf. Bundles typically contain a subset of files (configuration files and assets) from $SPLUNK_HOME/etc/system, $SPLUNK_HOME/etc/apps, and $SPLUNK_HOME/etc/users. Because of this distribution process, peers by default receive nearly the entire contents of the search head's apps. If an app contains large binaries that do not need to be shared with the peers, that can also be a reason for large bundle sizes.
You can read more specifically on those bundles here:
The best way to mitigate this issue is to reduce the bundle size on the search head itself. This is done with the replication blacklist (just deleting the bundles will only temporarily resolve disk usage problems, as they will get replicated again if they still exist on the SH). The blacklist allows you to limit what is sent to the search peers (indexers) in the knowledge bundle.
We have an entire documentation page on that here:
As mentioned earlier, most bin directories, jar files, and lookup files do not need to be replicated to search peers, and can be blacklisted in distsearch.conf. For example, on the search heads:

[replicationBlacklist]
noBinDir = .../bin/*
jarAndLookups = (jar|lookups)
You can then stop Splunk on each indexer (one at a time), remove the knowledge bundles in $SPLUNK_HOME/var/run/searchpeers (the entire contents of that directory can be deleted), and then start Splunk again. The search heads will redistribute the new, reduced-size knowledge bundles.
As an FYI, each indexer keeps 5 knowledge bundles per search head.