I'm currently having a problem with storage on one of my indexers. Here's a brief summary of my situation:
Among the indexes that I have across all indexers, one index (let's say "SMS") on Indexer B has already reached its maximum configured size. My question is as follows: if the forwarders keep sending data in load-balanced mode to all indexers, will they skip sending data to index "SMS" on Indexer B because its maximum capacity has been reached?
Thanks in advance.
The following page covers this: Managing Indexers and Clusters of Indexers
It says: "To set the maximum index size on a per-index basis, use the maxTotalDataSizeMB attribute. When this limit is reached, buckets begin rolling to frozen."
Data retention is set per index and per indexer, so the forwarder will keep sending data to all three indexers, and indexer2 will delete old buckets to make room for new incoming data.
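As a rough illustration of the setting quoted above, here is a minimal indexes.conf sketch for the indexer. The index name, paths, size, and archive directory are assumptions chosen for the example, not values from this thread. By default, frozen buckets are deleted; setting coldToFrozenDir archives them instead, which may matter if losing the oldest data is a concern.

```
# indexes.conf on the indexer (illustrative values only)
[sms]
homePath   = $SPLUNK_DB/sms/db
coldPath   = $SPLUNK_DB/sms/colddb
thawedPath = $SPLUNK_DB/sms/thaweddb

# Cap the whole index at roughly 50 GB. Once the cap is
# reached, the oldest buckets roll to frozen.
maxTotalDataSizeMB = 51200

# Optional: copy frozen buckets here instead of deleting them.
# The directory is hypothetical and must exist on the indexer.
coldToFrozenDir = /archive/splunk/sms
```

Because this file lives on each indexer, the limit applies to that indexer's copy of the index only.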
So this means there's a risk of data loss, is that correct? I'm quite confused, as sometimes the forwarders only send data to one indexer and ignore the rest even though they are in load-balanced mode.
So the forwarder load balancing is a little interesting. A forwarder will switch targets on a regular interval (default 30 seconds; autoLBFrequency, set in outputs.conf). This means that at any given time, a forwarder is only sending to one indexer. It isn't round robin; instead, the forwarder regularly randomizes the indexer list.
However, it only makes the switch when it's considered 'safe' to do so, to avoid half of an event going to Indexer A and the other half going to Indexer B. This means EOF on a file read, and 10 seconds of inactivity on a TCP connection.
So if your forwarders aren't keeping up with file writes, it's possible for them to get 'stuck' on an indexer, and for that 30 second period to extend quite a bit.
To mitigate, you can set forceTimebasedAutoLB = true (again in outputs.conf), but then you run into potential problems with events getting split. I wouldn't recommend this.
It's also worth noting that the forwarder doesn't know anything about the state of the indexer besides it being a valid target for data. It doesn't know if a particular index is full or not.
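For reference, the load-balancing settings mentioned above might look like this in outputs.conf on a forwarder. This is only a sketch: the group name, hostnames, and port are hypothetical, and forceTimebasedAutoLB is left commented out for the reason given above.

```
# outputs.conf on the forwarder (group name and hosts are hypothetical)
[tcpout]
defaultGroup = my_indexers

[tcpout:my_indexers]
# Listing several indexers enables automatic load balancing.
server = indexerA.example.com:9997, indexerB.example.com:9997, indexerC.example.com:9997

# Seconds between target switches (30 is the default).
autoLBFrequency = 30

# Forces a switch on schedule even mid-stream. This can split
# events across indexers, so enable it only if forwarders get stuck.
# forceTimebasedAutoLB = true
```

Nothing in this file refers to index sizes, which matches the point that the forwarder has no view of whether an index on a given indexer is full.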
I see now. This whole time it's been a misunderstanding on my part.
Thanks for the explanation, emiller.
It migrates the oldest buckets to frozen to make room for the new events (FIFO). It generates a log entry when this happens; you will get a log like this in
07-24-2014 01:30:51.609 +0200 INFO BucketMover - will attempt to freeze: candidate='/opt/splunk/var/lib/splunk/rest/db/db_#######_#######_#' because
Is there any way to configure the amount of data that gets moved to frozen? In my case, sometimes too much data is removed and sometimes too little (judging by the oldest event I can find with a search).
No, the bucket is the smallest unit of storage, so data is always frozen a whole bucket at a time.
In the long term, you can try specifying smaller hot buckets to avoid having overly large ones (buckets can reach 10GB on 64-bit systems when maxDataSize = auto_high_volume; the default auto setting is 750MB). Try 500MB to start.
But avoid making them too small, because that has a performance impact (especially on a cluster).
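A minimal sketch of the 500MB suggestion, assuming the index stanza is named sms (a placeholder for the real index name). maxDataSize is given in MB and caps how large a hot bucket grows before it rolls to warm.

```
# indexes.conf on the indexer (stanza name is a placeholder)
[sms]
# Roll hot buckets at about 500 MB so each frozen bucket
# removes a smaller, more predictable slice of data.
maxDataSize = 500
```

As far as I know, this only affects buckets created after the change; existing buckets keep their current size, so the effect on freezing shows up gradually.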