Searches are failing on Core SH cluster
* splunkd.log *
Unable to distribute to peer named myindexer00 at uri https://10.1.110.1:8089 because replication was unsuccessful. ReplicationStatus: Failed - Failure info: failed_because_BUNDLE_SIZE_EXCEEDS_MAX_SIZE. Verify connectivity to the search peer, that the search peer is up, and that an adequate level of system resources are available. See the Troubleshooting Manual for more information.
$ splnks@splk07 /opt/splunk/bin $ ./splunk cmd btool distsearch list --debug | grep maxBundleSize
/opt/splunk/etc/apps/hawkEye/default/distsearch.conf maxBundleSize = 3096
$ splnks@splk07 /opt/splunk/var/run $ ls -alrth
drwx------. 12 splunk splunk 4.0K Jul 22 14:34 searchpeers
-rw-------. 1 splunk splunk 46 Jul 22 16:01 2A59DFC9-2832-4ABD-B8E1-BC991A6379D0-1595448064.bundle.info
-rw-------. 1 splunk splunk 3.0G Jul 22 16:01 2A59DFC9-2832-4ABD-B8E1-BC991A6379D0-1595448064.bundle
-rw-------. 1 splunk splunk 46 Jul 22 16:12 2A59DFC9-2832-4ABD-B8E1-BC991A6379D0-1595448738.bundle.info
-rw-------. 1 splunk splunk 2.9G Jul 22 16:12 2A59DFC9-2832-4ABD-B8E1-BC991A6379D0-1595448738.bundle
There are two options:
1) Increase maxBundleSize & max_content_length, OR
2) Reduce the bundle size: inspect the bundle file and blacklist the large files.
According to the log message, the bundle size appears to exceed the limit after HTTP MIME encoding. The bundle most likely contains some large files; you can list its contents with 'tar tvfz bundle_name.bundle':
$ tar tvfz 2A59DFC9-2832-4ABD-B8E1-BC991A6379D0-1595448064.bundle | sort -k 3 -n
This sorts the files by size, so the largest files appear at the bottom. (In GNU tar's verbose listing the size is the third field; use -k 5 if your tar prints owner and group as separate columns, as bsdtar does.) You can then blacklist the large files or check with your customer who owns them.
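The same inspection can be sketched in Python, which avoids depending on tar's column layout. This is only an illustration: the function name largest_members and the top parameter are made up here, and it assumes the bundle is a tar archive (optionally compressed), as the tar command above implies.

```python
import tarfile


def largest_members(bundle_path, top=10):
    """Return (size_in_bytes, name) for the `top` largest regular files.

    Assumes `bundle_path` is a tar archive; mode "r:*" lets tarfile
    detect gzip or no compression on its own.
    """
    with tarfile.open(bundle_path, mode="r:*") as tar:
        # Only regular files carry a meaningful size; skip directories and links.
        files = [(member.size, member.name) for member in tar if member.isfile()]
    # Largest first, so the candidates for blacklisting come out on top.
    return sorted(files, reverse=True)[:top]
```

For example, largest_members("2A59DFC9-2832-4ABD-B8E1-BC991A6379D0-1595448064.bundle", top=20) would return the 20 largest files in that bundle, largest first.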
* How to blacklist: once you find a large lookup, such as "apps/APPName/lookups/PatternOfLargeFiles.csv", add an entry for it as below (here the app is APP100).
Conf file, for example $SPLUNK_HOME/etc/system/local/distsearch.conf:
[replicationBlacklist]
DoNotInclude = apps/APP100/lookups/PatternOfLargeFiles.csv
In case you want to increase the max bundle size instead:
server.conf, on all indexers and SHs:
[httpServer]
max_content_length = <bytes; default: 2147483648 (2GB)>
distsearch.conf, on all SHs:
[replicationSettings]
maxBundleSize = <MB>
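Note that the two settings use different units (MB vs. bytes), so it is easy to raise one and leave the other too small. A rough sketch of the arithmetic follows; the helper names are made up for illustration, and it assumes 1 MB = 1024 * 1024 bytes and ignores any HTTP encoding overhead, so treat the result as a lower bound and leave some headroom.

```python
MIB = 1024 * 1024  # assumption: maxBundleSize "MB" means 1024 * 1024 bytes

DEFAULT_MAX_CONTENT_LENGTH = 2147483648  # bytes (2GB), the default quoted above


def min_content_length(max_bundle_size_mb):
    """Smallest max_content_length (bytes) that can hold a bundle of maxBundleSize MB."""
    return max_bundle_size_mb * MIB


def bundle_limit_fits(max_bundle_size_mb, max_content_length=DEFAULT_MAX_CONTENT_LENGTH):
    """True if a bundle as large as maxBundleSize still fits in max_content_length."""
    return min_content_length(max_bundle_size_mb) <= max_content_length


# With maxBundleSize = 3096 as in the btool output above:
print(min_content_length(3096))  # 3246391296 bytes
print(bundle_limit_fits(3096))   # False: larger than the 2GB default
```

So with maxBundleSize = 3096, max_content_length would need to be at least 3246391296 on the indexers and SHs under these assumptions.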
More information is in the doc below:
https://docs.splunk.com/Documentation/Splunk/8.0.5/DistSearch/Limittheknowledgebundlesize
Thanks for the write-up. It's useful for anyone with search head replication issues.