Create bloom filter after the fact

Communicator

I've been backfilling a year's worth of logs, and only just realized that I never reconfigured maxBloomBackfillBucketAge. As a result, none of these old logs have bloom filters, which I badly need given the size of these logs. Is there any way to create the bloom filters without blowing these logs away and starting from scratch?

1 Solution

Communicator

From the indexes.conf docs:
http://docs.splunk.com/Documentation/Splunk/latest/Admin/indexesconf


maxBloomBackfillBucketAge = <integer>[smhd]
* If a (warm or cold) bucket is older than this, we shall not [re]create its bloomfilter when we come across it
* Defaults to 30d.
* When set to 0, bloomfilters are never rebuilt

If you set this to a large value (e.g. 700d) and restart Splunk, it will automatically start recreating the bloom filters as part of the fsck process:

05-08-2012 09:54:33.066 -0500 INFO  ProcessTracker - (child_2__Fsck)  Fsck - Rebuild --bloom-only bucket /opt/splunk/var/lib/splunk/proxy/db/db_1327467837_1327451635_11 took 2635.6 milliseconds
05-08-2012 09:55:05.173 -0500 INFO  ProcessTracker - (child_3__Fsck)  Fsck - Rebuild --bloom-only bucket /opt/splunk/var/lib/splunk/proxy/db/db_1327451634_1327435722_10 took 3.535 seconds
05-08-2012 09:55:19.568 -0500 INFO  ProcessTracker - (child_4__Fsck)  Fsck - Rebuild --bloom-only bucket /opt/splunk/var/lib/splunk/proxy/db/db_1327435721_1327426983_9 took 3.306 seconds
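
For illustration, the change might look something like the stanza below. This is only a sketch: the index name `proxy` is inferred from the bucket paths in the log output, the file location is an assumption (any indexes.conf that takes effect for the index should do), and 700d is just the example value used above.

```
# $SPLUNK_HOME/etc/system/local/indexes.conf  (assumed location)
[proxy]
# Allow bloom filters to be [re]created for buckets up to 700 days old
# (the default of 30d skips older backfilled buckets).
maxBloomBackfillBucketAge = 700d
```

A restart of Splunk is needed afterwards, as noted above.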


Champion

Probably an easier way to do this without editing configs would be to run the fsck rebuild process manually, as described here:

http://docs.splunk.com/Documentation/Splunk/4.3.2/admin/HowSplunkstoresindexes#Troubleshootyourbuckets
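
Whichever route you take, it may help to check how old the backfilled buckets actually are. Warm and cold bucket directories are named db_<newest event epoch>_<oldest event epoch>_<id>, so a rough age can be read straight off the path. The Python sketch below does that for a list of bucket paths and suggests a maxBloomBackfillBucketAge value that would cover them. The helper names are hypothetical, and measuring age from the newest event in the bucket is an assumption on my part; Splunk may compute bucket age differently, so treat the result as a lower bound and round up generously.

```python
import math
import re
from datetime import datetime, timezone

# Matches a bucket directory name such as db_1327467837_1327451635_11
BUCKET_RE = re.compile(r"db_(\d+)_(\d+)_(\d+)$")


def bucket_age_days(path, now):
    """Approximate bucket age in days, measured from its newest event (assumption)."""
    match = BUCKET_RE.search(path.rstrip("/"))
    if match is None:
        raise ValueError(f"not a warm/cold bucket path: {path!r}")
    newest_epoch = int(match.group(1))
    return (now.timestamp() - newest_epoch) / 86400.0


def suggested_backfill_age(paths, now):
    """Smallest whole-day setting that covers every bucket in paths."""
    oldest = max(bucket_age_days(p, now) for p in paths)
    return f"{math.ceil(oldest)}d"


if __name__ == "__main__":
    # Bucket paths taken from the log output earlier in this thread.
    buckets = [
        "/opt/splunk/var/lib/splunk/proxy/db/db_1327467837_1327451635_11",
        "/opt/splunk/var/lib/splunk/proxy/db/db_1327451634_1327435722_10",
        "/opt/splunk/var/lib/splunk/proxy/db/db_1327435721_1327426983_9",
    ]
    now = datetime.now(timezone.utc)
    for b in buckets:
        print(f"{b}: {bucket_age_days(b, now):.1f} days old")
    print("maxBloomBackfillBucketAge should be at least", suggested_backfill_age(buckets, now))
```

Any bucket whose age comes out above the 30d default would have been skipped, which matches what the original poster saw.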
