Why is fishbucket getting really big on my Universal Forwarder?

Champion

I have allocated 2 GB of space for the Splunk Universal Forwarder, and the fishbucket is consuming 1.6 GB of that space. What is the expected size of the fishbucket, and is there any way to tune its size?

Why is the fishbucket getting so large?

Re: Why is fishbucket getting really big on my Universal Forwarder?

SplunkTrust

That's a big bucket of fish.

Sorry, nothing of value to add 😛

Re: Why is fishbucket getting really big on my Universal Forwarder?

Path Finder

I am in a similar situation, and Splunk support says that a UF should have at least 5 GB of space available to it. Probably not what you wanted to hear (it wasn't for me, since that translates to about 10 TB of extra storage across our environment). Also, if you have the nmon app installed, we found that it was contributing to the fishbucket's rapid growth.

Re: Why is fishbucket getting really big on my Universal Forwarder?

Path Finder

I've had many forwarders fill up the 2 GB file system set aside for them, even though the file_tracking_db_threshold_mb setting is at its default of 500.

Why would that be happening?

Re: Why is fishbucket getting really big on my Universal Forwarder?

SplunkTrust

$ du -sh /opt/splunkforwarder/*

I'm guessing a big chunk of that should be logs.
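
To narrow that down, a rough sketch like the one below lists the size of each entry under the install directory, the fishbucket, and the log directory, smallest first. The `usage_by_dir` helper is a hypothetical name, and the `var/lib/splunk/fishbucket` and `var/log/splunk` subpaths are assumed from a default install, so adjust them if your layout differs.

```shell
# Assumed default install path; set SPLUNK_HOME beforehand if yours differs.
SPLUNK_HOME="${SPLUNK_HOME:-/opt/splunkforwarder}"

# Hypothetical helper: print the size in KB of each entry directly
# under the given directory, sorted so the largest entry comes last.
usage_by_dir() {
    du -sk "$1"/* 2>/dev/null | sort -n
}

usage_by_dir "$SPLUNK_HOME"                           # whole install
usage_by_dir "$SPLUNK_HOME/var/lib/splunk/fishbucket" # fishbucket only (assumed path)
usage_by_dir "$SPLUNK_HOME/var/log/splunk"            # forwarder logs (assumed path)
```

If the last line of the second listing dominates, the fishbucket is the culprit; if the third does, it is the logs.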

Re: Why is fishbucket getting really big on my Universal Forwarder?

Splunk Employee

It's like thermodynamics: the fishbucket/btree is the entropy of your file system. It can only grow over time.

In old versions it was not limited; in somewhat newer versions it was controlled by the hot bucket size in indexes.conf.
Fortunately, since Splunk 6.*, you can set a limit on the size in limits.conf;
the checksums for files that are no longer present will then be removed.
The default limit is 500 MB.

See:
[inputproc]
file_tracking_db_threshold_mb = 500

http://docs.splunk.com/Documentation/Splunk/latest/Admin/Limitsconf

Remark: because we maintain a backup copy, the disk space used is actually 2 times the limit, and sometimes 3 times the limit while a temporary file is being generated for the new backup.
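
As a quick worked example of that remark, the sketch below applies the 2x and 3x multipliers to the default threshold. The multipliers come straight from the remark above; the variable names are only illustrative.

```shell
# Threshold from limits.conf ([inputproc] file_tracking_db_threshold_mb), in MB.
limit_mb=500

# Live database plus its backup copy.
steady_mb=$((limit_mb * 2))
# Live database, backup copy, and the temporary file for the new backup.
peak_mb=$((limit_mb * 3))

echo "steady=${steady_mb}MB peak=${peak_mb}MB"
```

With the default of 500, this gives about 1000 MB in steady state and up to 1500 MB at peak, which is in the same range as the 1.6 GB reported in the original question.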

Re: Why is fishbucket getting really big on my Universal Forwarder?

Path Finder

So continuing on this thread (sorry it's old but I feel this is still relevant):
On a Splunk Universal Forwarder, its local limits.conf file has the following lines to control the fishbucket size:
[inputproc]
file_tracking_db_threshold_mb = 500

However, weirdly, 1.4 GB in the fishbucket folder is taken up by splunk_private_db, of which ~1 GB is consumed by the save folder.

I've read that the total space used on disk with the size limit above can be double or even triple that limit, because there will be a backup and possibly some associated temporary files.

Is there a way to reduce this disk usage via the Splunk Forwarder's config? Or is manually removing some of the files the only way?
And is this within normal expectations of the Splunk Forwarder's behaviour?

Re: Why is fishbucket getting really big on my Universal Forwarder?

SplunkTrust

Three times is the upper limit (from my understanding) as there is the snapshot copy of the fishbucket and a temporary copy.

You can lower:

[inputproc]
file_tracking_db_threshold_mb = 500

However, you must be aware that lowering this value may result in files being re-indexed if they are still on the filesystem (500 MB is a very large fishbucket, so the chance of this happening is super-low). For example, if the fishbucket had tracked 2 million files, and you still had 2 million files on the filesystem, then reducing the fishbucket size to track only 1 million would cause the remaining 1 million to be re-indexed.

I've lowered it before and had minimal issues, except on servers with hundreds (or thousands) of old files. There, after a few months, the old files would be re-indexed: they were the oldest entries in the fishbucket, so they were removed first, and because the files were still in the monitored directory path they were picked up again.
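
If you want to size the setting from a disk budget, a rough sketch is to divide the budget by three, taking the 3x upper limit mentioned above as an assumption. The 600 MB budget here is a hypothetical figure, not a recommendation, and the re-indexing caveat above still applies to whatever value you end up with.

```shell
# Hypothetical budget: the most disk (in MB) the fishbucket may use at its peak.
budget_mb=600

# Assumes peak usage is at most 3x the threshold (database + snapshot + temporary copy).
threshold_mb=$((budget_mb / 3))

# Print the stanza to place in the forwarder's local limits.conf.
printf '[inputproc]\nfile_tracking_db_threshold_mb = %d\n' "$threshold_mb"
```

For a 600 MB budget this prints a stanza with a threshold of 200.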