Hi all,
I'm currently having a problem with storage on one of my indexers. Here's a brief summary of my situation:
Among the indexes that I have across all indexers, one index (let's say "SMS") on Indexer B has already reached its maximum configured size. My question is as follows: if the forwarders keep sending data in load-balanced mode to all indexers, will the forwarders skip sending data to index "SMS" on Indexer B since its maximum capacity has been reached?
Thanks in advance.
Best Regards,
Vincent
The following page covers this: Managing Indexers and Clusters of Indexers.
It says - To set the maximum index size on a per-index basis, use the maxTotalDataSizeMB attribute. When this limit is reached, buckets begin rolling to frozen.
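As a sketch, a per-index cap set with that attribute in indexes.conf might look like this (the stanza name "sms" and the size value are placeholders, not values from this thread):

```ini
# indexes.conf -- illustrative sketch; "sms" and the size are examples
[sms]
homePath   = $SPLUNK_DB/sms/db
coldPath   = $SPLUNK_DB/sms/colddb
thawedPath = $SPLUNK_DB/sms/thaweddb
# Maximum total size of the index (hot + warm + cold buckets), in MB.
# When this limit is reached, the oldest buckets roll to frozen.
maxTotalDataSizeMB = 500000
```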
It migrates the oldest buckets to frozen to make room for new events (FIFO). When this happens, it writes a log line like this to _internal:
07-24-2014 01:30:51.609 +0200 INFO BucketMover - will attempt to freeze: candidate='/opt/splunk/var/lib/splunk/rest/db/db_#######_#######_#' because
Is there any way to configure the amount of data removed to the frozen bucket? In my case, sometimes too much data is removed and sometimes too little (judging from the oldest event I can see via search).
No, the bucket is the smallest unit of storage.
Over the long term, you can specify smaller hot buckets to avoid having very large ones (up to 10 GB per bucket by default); try 500 MB to start.
But avoid making them too small, because that has a performance impact (especially on a cluster).
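That suggestion could be applied in indexes.conf along these lines (again, the "sms" stanza and exact value are illustrative only):

```ini
# indexes.conf -- sketch of a smaller hot bucket size, per the advice above
[sms]
# Roll hot buckets to warm at ~500 MB instead of the automatic sizing
# ("auto_high_volume" can grow buckets to 10 GB on 64-bit systems)
maxDataSize = 500
```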
Thanks for the explanation, yannK.
One more, is there any recommendation for the ratio between max index size and max size for hot bucket?
To avoid warnings, you want maxDataSize (the maximum hot bucket size) to be smaller than maxTotalDataSizeMB.
But you also want the buckets to be large enough that you don't create too many of them (performance impact).
Try and see; it depends on your daily ingestion volume and the time range of your data.
The | dbinspect command is useful for looking at your bucket distribution.
http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Dbinspect
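For example, a quick look at bucket counts and sizes for one index might look like this (the index name "sms" is a placeholder; sizeOnDiskMB, startEpoch, endEpoch, and state are fields returned by dbinspect):

```spl
| dbinspect index=sms
| stats count AS buckets,
        sum(sizeOnDiskMB) AS total_mb,
        min(startEpoch) AS oldest_event,
        max(endEpoch) AS newest_event
  BY state
```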
Noted yannK. Thanks for your help
Data retention is set on a per-index and per-indexer basis, so the forwarders will keep sending data to all three indexers; the indexer that is full (Indexer B in your case) will delete old buckets to make room for new incoming data.
So this means there's a risk of data loss, correct? I'm also confused because sometimes the forwarders only send data to one indexer and ignore the rest, even though they're in load-balanced mode.
So the forwarder load balancing is a little interesting. A forwarder switches targets at a regular interval (default 30 seconds; autoLBFrequency, set in outputs.conf). This means that at any given time, a forwarder is only sending to one indexer. It isn't round robin; instead, it regularly re-randomizes the indexer list.
However, it only makes the switch when it's considered 'safe' to do so, to avoid half of an event going to Indexer A, and the other half going to indexer B. This means EOF on a file read, and 10 seconds of inactivity on a TCP connection.
So if your forwarders aren't keeping up with file writes, it's possible for them to get 'stuck' on an indexer, and for that 30 second period to extend quite a bit.
To mitigate, you can set forceTimebasedAutoLB = true (again in outputs.conf), but then you run into potential problems with events getting split. I wouldn't recommend this.
It's also worth noting that the forwarder doesn't know anything about the state of the indexer besides it being a valid target for data. It doesn't know if a particular index is full or not.
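For reference, the forwarder-side settings mentioned above live in outputs.conf; a minimal sketch, with placeholder server names:

```ini
# outputs.conf on the forwarder -- sketch; the group name and
# server list are examples, not values from this thread
[tcpout:primary_indexers]
server = indexerA:9997, indexerB:9997, indexerC:9997
# How often (in seconds) the forwarder picks a new target (default 30)
autoLBFrequency = 30
# Forces the switch even mid-stream -- risks splitting events,
# and the answer above recommends against it
# forceTimebasedAutoLB = true
```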
I see now. This whole time it's been a misunderstanding on my part.
Thanks for the explanation, emiller.
Noted. Thank you for the documentation.