My forwarders monitor several thousand Oracle logs daily, and these rotate out at a high frequency. As a result, my fishbucket index is growing at a steady pace; it currently sits at 200MB+ on my forwarders. I understand that this is considered small, relatively speaking, but due to policies in place, I can't allow the Splunk forwarder to take up this much space on the system it sits on. Is there a way to delete records out of the fishbucket and reclaim space? Just an FYI: I am well aware that this could lead to reindexing.
If you're using a forwarder, you can run 'splunk clean eventdata' from $SPLUNK_HOME/bin and it'll reset the fishbucket as well as any other data you've collected. Since you're not indexing and are aware that it could lead to reindexing, I suppose this is a good option for you. As an aside, the issue of not being able to control the fishbucket size has been raised in SPL-56516 and should be addressed in a future release of the product.
To delete specific entries from btree, you can see this post:
If you're using a UF (universal forwarder), you cannot run 'splunk clean eventdata' because the UF's index databases are disabled. You have to stop Splunk and delete the $SPLUNK_HOME/var/lib/splunk/fishbucket directory.
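A minimal sketch of that stop/delete/start sequence, in case it helps. The function name reset_fishbucket is made up for illustration, and the fishbucket location under var/lib/splunk/fishbucket is an assumption based on the 6.0+ paths mentioned in this thread; check the path on your own forwarder before running anything like this.

```shell
# Sketch only: reset the fishbucket on a universal forwarder.
# reset_fishbucket is a hypothetical helper name, not a Splunk command.
# Usage: reset_fishbucket /opt/splunkforwarder
reset_fishbucket() {
    splunk_home="$1"
    if [ -z "$splunk_home" ]; then
        echo "usage: reset_fishbucket <SPLUNK_HOME>" >&2
        return 2
    fi

    # Assumed location of the fishbucket; verify on your install.
    fishbucket="$splunk_home/var/lib/splunk/fishbucket"

    # Stop the forwarder first so nothing is writing to the btree.
    "$splunk_home/bin/splunk" stop || return 1

    # Remove the fishbucket; it is recreated on the next start.
    rm -rf -- "$fishbucket"

    # Restart; monitored files will be read again from the first line.
    "$splunk_home/bin/splunk" start
}
```

Because every monitored file is read again from the beginning after this, expect duplicate events unless the old logs have already rotated away.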
Note that cleaning the fishbucket deletes all records of which files were monitored and how far each had been read. The UF will therefore start monitoring from the first line of every log file you're monitoring, so avoiding duplicate events is a challenge. And once duplicate events are indexed, it is another challenge to keep one copy of each and delete the rest.
In Splunk 6.0+, the btree/fishbucket files have a size ceiling that is maintained. If the fishbucket files grow over a configurable ceiling, they are moved from $SPLUNK_HOME/var/lib/splunk/fishbucket/splunk_private_db to $SPLUNK_HOME/var/lib/splunk/fishbucket/splunk_private_db/save. We then populate a new, empty btree upon request -- entries we actually use are copied from the 'save' version.
Ultimately this means that your size will be bounded to 2x the ceiling: one copy in the live btree and one in 'save'.
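If you want to check a forwarder against that 2x bound, something like the following could work. It is only a sketch: fishbucket_within_bound is a hypothetical helper, and you pass in the directory and the ceiling (in MB) yourself, since where the ceiling is configured depends on your version.

```shell
# Sketch only: succeed if the directory uses no more than 2x the ceiling.
# fishbucket_within_bound is a hypothetical helper name.
# Usage: fishbucket_within_bound <fishbucket_dir> <ceiling_mb>
fishbucket_within_bound() {
    dir="$1"
    ceiling_mb="$2"

    # du -sk reports total usage in kilobytes (POSIX-portable).
    used_kb=$(du -sk "$dir" | awk '{print $1}')

    # Bound is 2x the ceiling, converted from MB to KB.
    limit_kb=$((ceiling_mb * 2 * 1024))

    [ "$used_kb" -le "$limit_kb" ]
}
```

For example, `fishbucket_within_bound "$SPLUNK_HOME/var/lib/splunk/fishbucket" 100 || echo "over bound"` would flag a forwarder using more than 200MB.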
If you need to resolve a current problem where the file is already very large (let's say 10GB), we will copy your current btree/fishbucket data to 'save', so disk usage will not improve immediately. In this case you can resolve your space concerns in the following way:
At this point your disk usage for btree/fishbucket will be constrained to 2x the limit.
In 6.0.x we use the maxTotalDataSizeMB value for the [fishbucket] index to configure this limit. From the next major release after 6.0.x, there will be a dedicated configuration in limits.conf for this purpose.
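For 6.0.x, the setting might look roughly like the fragment below in $SPLUNK_HOME/etc/system/local/indexes.conf. Treat this as a sketch: the stanza name _thefishbucket is my assumption about how the fishbucket index is named in indexes.conf (the answer above just calls it [fishbucket]), and 100 is an arbitrary example value, not a recommendation.

```
# indexes.conf (sketch; stanza name assumed, value illustrative)
[_thefishbucket]
maxTotalDataSizeMB = 100
```

With a ceiling of 100MB, the bound described above would put worst-case fishbucket disk usage at roughly 200MB. A restart of the forwarder is normally needed for indexes.conf changes to take effect.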