How to reduce Splunk fishbucket size?

damucka
Builder

Hello,

We have an issue with the disk space used by the Universal Forwarder (UF) on our clients. It approaches 2 GB of the 4 GB available for /opt in our environments and triggers alerts. The size distribution looks as follows:

ls5924:/opt/splunkforwarder/var/lib/splunk/fishbucket/splunk_private_db # du . -xh -d1 | sort -h
173M ./snapshot
1001M ./save
1.4G .

ls5924:/opt/splunkforwarder/var/lib/splunk/fishbucket/splunk_private_db/save # du . -xh -d1 | sort -h
501M ./snapshot
1001M .
ls5924:/opt/splunkforwarder/var/lib/splunk/fishbucket/splunk_private_db/save # ls -lrth
total 501M
-rw------- 1 splunk splunk 146M Oct 29 20:05 btree_index.dat
-rw------- 1 splunk splunk 355M Oct 29 20:05 btree_records.dat
drwx------ 2 splunk splunk 4.0K Oct 29 20:05 snapshot

ls5924:/opt/splunkforwarder/var/lib/splunk/fishbucket/splunk_private_db/save/snapshot # ls -lrth
total 501M
-rw------- 1 splunk splunk 355M Oct 29 20:05 btree_records.dat
-rw------- 1 splunk splunk 10 Oct 29 20:05 snap.dat
-rw------- 1 splunk splunk 146M Oct 29 20:05 btree_index.dat

I have read that the fishbucket size can be lowered with this parameter in limits.conf:

[inputproc]
file_tracking_db_threshold_mb = 500

I would like to lower it to 200, but first I would like to analyze the situation and check whether the old entries that would be wiped out could still have matching files in the monitored directories, because I want to avoid reindexing them. My questions are:
- Is there any way to perform such an analysis? That is, can the fishbucket files be read to tell how big the fishbucket needs to be without risking reindexing?
- As I understand it, the files moved to the save directory are "only" backup copies. Suppose I am willing to take the risk and do without them. Is that possible? Is there a parameter that skips creating the backup copies? Can I link this directory to /dev/null?

I know that ideally I should increase the space available for the fishbucket, but given the number of clients I have, that would be a bigger process and discussion on our side. Therefore I would like to start by lowering the fishbucket size.

Kind Regards,
Kamil

impurush
Contributor

I have exactly the same question, but I have not seen an answer to it.
@damucka, by any chance, did you solve the issue?

gjanders
SplunkTrust

As per the initial question, you can lower the size of the fishbucket, but I have not heard of a way to look at its contents.
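
Before changing the threshold, it can help to confirm which value the forwarder is actually using. The following is a minimal sketch, assuming the install path /opt/splunkforwarder from the question; the function name `effective_threshold_mb` is hypothetical. It relies on `splunk btool limits list inputproc`, which prints the merged [inputproc] stanza including defaults.

```shell
#!/bin/sh
# Assumed install path (taken from the paths in the question); adjust as needed.
SPLUNK_HOME="${SPLUNK_HOME:-/opt/splunkforwarder}"
# Overridable so the function can be tried without a real forwarder.
SPLUNK_BIN="${SPLUNK_BIN:-$SPLUNK_HOME/bin/splunk}"

# Print the effective file_tracking_db_threshold_mb value (hypothetical helper).
effective_threshold_mb() {
    "$SPLUNK_BIN" btool limits list inputproc |
        awk -F'=' '/^[ \t]*file_tracking_db_threshold_mb[ \t]*=/ {
            gsub(/[ \t]/, "", $2)   # strip spaces around the value
            print $2
        }'
}
```

Calling `effective_threshold_mb` on a forwarder should print the number currently in effect, which is the baseline to compare against before lowering it.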

The btprobe command can work with the fishbucket on a per-file basis.
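
As a rough illustration of the per-file check, here is a minimal sketch that wraps btprobe for a single monitored file. The install path is assumed from the question, and the function name `probe_file` is hypothetical. It uses only the `--validate` option, which to my knowledge looks the file up in the fishbucket without changing it; `--reset` is deliberately left out because it removes the record and forces the file to be reindexed.

```shell
#!/bin/sh
# Assumed install path (taken from the paths in the question); adjust as needed.
SPLUNK_HOME="${SPLUNK_HOME:-/opt/splunkforwarder}"
# Overridable so the function can be tried without a real forwarder.
SPLUNK_BIN="${SPLUNK_BIN:-$SPLUNK_HOME/bin/splunk}"
FISHBUCKET_DB="${FISHBUCKET_DB:-$SPLUNK_HOME/var/lib/splunk/fishbucket/splunk_private_db}"

# Look up the fishbucket record for one monitored file (hypothetical helper).
probe_file() {
    "$SPLUNK_BIN" cmd btprobe -d "$FISHBUCKET_DB" --file "$1" --validate
}

# Example (commented out): probe every file in a monitored directory.
# for f in /var/log/myapp/*.log; do probe_file "$f"; done
```

This does not dump the fishbucket, but run over the files that are still on disk it gives a per-file view of which ones the forwarder already knows about.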

I don't think removing the backup files is a good idea; I believe they are part of how the forwarder functions.

impurush
Contributor

Hi, @gjanders,

Yes, you are right. However, if we reduce the fishbucket size, there is a high chance that files get reindexed if they are still on the server. That is why I want to see the contents of the fishbucket. Also, we have identical servers, and the fishbucket is very large on one and very small on another, so I am curious what the fishbucket contains.
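
To put numbers on the difference between otherwise identical servers, a small report run on each host can help. This is a minimal sketch, assuming the fishbucket path shown in the original question; the function names `dir_usage_mb` and `fishbucket_report` are hypothetical. It only measures disk usage and does not read the fishbucket contents.

```shell
#!/bin/sh
# Assumed fishbucket location (taken from the paths in the question).
FISHBUCKET_DB="${FISHBUCKET_DB:-/opt/splunkforwarder/var/lib/splunk/fishbucket/splunk_private_db}"

# Print the disk usage of a directory in whole megabytes, rounded up.
dir_usage_mb() {
    kb=$(du -sk "$1" 2>/dev/null | awk '{print $1}')
    kb=${kb:-0}   # treat a missing directory as 0 KB
    echo $(( (kb + 1023) / 1024 ))
}

# Print one line per host: total fishbucket size and the size of save/.
fishbucket_report() {
    printf '%s total=%sMB save=%sMB\n' \
        "$(uname -n)" \
        "$(dir_usage_mb "$FISHBUCKET_DB")" \
        "$(dir_usage_mb "$FISHBUCKET_DB/save")"
}
```

Collecting the `fishbucket_report` line from each server makes it easy to see which hosts are close to the configured threshold and which are not.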

gjanders
SplunkTrust

You could add an idea for this https://ideas.splunk.com/ideas

Very old versions of Splunk used to allow you to query the fishbucket; I don't believe any modern version does.

However, the data may be of limited use: it is mostly checksums in there, so any output may not tell you much.

I am unaware of a way to show the contents of the fishbucket.
