Hi,
Currently I keep 18 months of logs and I'm spending a lot of resources on storage (AWS EBS), so until we find a solution for sending frozen data to AWS S3 and retrieving it back, I decided to change the replication factor from 3 to 2, since an indexer failure is something really rare.
My issue is that I set this config and restarted my Cluster Master, as the documentation instructs, but the available disk space on my indexer instances didn't change.
How can I purge the unnecessary replicated data?
As you've discovered, the excess buckets are not removed automatically. It's a manual process using the CLI.
See more in this related question https://answers.splunk.com/answers/369399/how-to-reduce-the-replication-factor-in-a-multisit.html
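For reference, the command is run on the cluster master. A minimal sketch, assuming a recent Splunk version and using "main" purely as an example index name (check the exact syntax in the docs for your release):
splunk remove excess-buckets
splunk remove excess-buckets main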
It's interesting to look at /opt/splunk/etc/system/default:
site_replication_factor = origin:2, total:3
site_search_factor = origin:1, total:2
The documentation, under "Configure the site replication factor", says the following (a small multisite example follows the list):
site_replication_factor = origin:<n>, [site1:<n>,] [site2:<n>,] ..., total:<n>
• n is a positive integer indicating the number of copies of a bucket.
• origin: specifies the minimum number of copies of a bucket that will be held on the site originating the data in that bucket (that is, the site where the data first entered the cluster). When a site is originating the data, it is known as the "origin" site.
• site1:, site2:, ..., indicates the minimum number of copies that will be held at each specified site. The identifiers "site1", "site2", and so on, are the same as the site attribute values specified on the peer nodes.
• total: specifies the total number of copies of each bucket, across all sites in the cluster.
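To make that concrete, a hypothetical two-site master stanza might look like the lines below; the site names and counts are illustrative, not taken from this thread:
[clustering]
mode = master
multisite = true
available_sites = site1,site2
site_replication_factor = origin:2, total:3
site_search_factor = origin:1, total:2
This keeps at least two copies of each bucket on the site where the data originated and three copies across the whole cluster.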
So, by default the replication factor is 3, if I read it correctly ;-) Many of us, along the way, decide to go lower...
Am I wrong, or does $SPLUNK_HOME/etc/system/local/server.conf take precedence over $SPLUNK_HOME/etc/system/default/server.conf?
In my local server.conf the replication factor is set as I want:
[clustering]
cluster_label = master1
mode = master
pass4SymmKey = $1$RmUoN98$
replication_factor = 2
rebalance_threshold = 0.9
search_factor = 2
max_peer_build_load = 2
max_peer_rep_load = 20
max_peer_sum_rep_load = 20
maintenance_mode = false
When I posted this question yesterday, my total indexed data was about 97 TB; now it is 82 TB, so I think Splunk might have an auto purge, but it works without hurry.
Thanks anyway.
Refer to "Summary of directory precedence" on the Configuration file precedence page, or use btool, but yes, system/local overrides system/default.
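For example, a quick way to check which file wins for the clustering stanza (the --debug flag prints the source file next to each setting; run this on the cluster master):
splunk btool server list clustering --debug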
In addition to richgalloway's answer, refer to "Remove excess bucket copies from the indexer cluster". In modern Splunk versions you can also remove the buckets from the GUI.
BTW, my Splunk Web version is 7.1.2.
Thank you both. This "Remove excess buckets" option is what I need, but I must say that I index around 300 GB/day, my TTL is 18 months, and I haven't expired any logs since I started using this TTL. So if my total indexed data decreased by more than 10% from yesterday to now, I can only believe that Splunk has some worker that trims the buckets to fit the config.