Getting Data In

What does Splunk do when one index in an indexer has reached maximum capacity?

Communicator

Hi all,

I'm currently having a problem with storage on one of my indexers. Here's a brief summary of my setup:

  • 1 Search Head instance
  • 3 Indexer instances
  • Several Universal Forwarders, configured to send data to all 3 indexers in load-balance mode

Among the indexes that I have on all indexers, one index (let's say "SMS") on Indexer B has already reached its maximum configured size. My question is: if the forwarders keep sending data in load-balance mode to all indexers, will they skip sending data for index "SMS" to Indexer B because its maximum capacity has been reached?

Thanks in advance.

Best Regards,

Vincent


Re: What does Splunk do when one index in an indexer has reached maximum capacity?

Ultra Champion

The following documentation page covers this: Managing Indexers and Clusters of Indexers.

It says: "To set the maximum index size on a per-index basis, use the maxTotalDataSizeMB attribute. When this limit is reached, buckets begin rolling to frozen."
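
As a rough sketch, the attribute quoted above would go in the index's stanza in indexes.conf on each indexer. The stanza name, the size value, and the archive path below are assumptions for illustration and are not taken from the original poster's setup; the path settings every index needs (homePath, coldPath, thawedPath) are omitted for brevity.

```ini
# indexes.conf on each indexer (illustrative values only)
[sms]
# Cap the total size of this index on this indexer, in MB.
# When the cap is reached, the oldest buckets roll to frozen.
maxTotalDataSizeMB = 100000

# Optional: archive frozen buckets instead of deleting them.
# Hypothetical path; without this (or coldToFrozenScript), frozen data is deleted.
coldToFrozenDir = /archive/splunk/sms/frozen
```

Because the limit applies per index and per indexer, each indexer enforces it independently of the others.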


Re: What does Splunk do when one index in an indexer has reached maximum capacity?

Communicator

Noted. Thank you for the documentation.


Re: What does Splunk do when one index in an indexer has reached maximum capacity?

SplunkTrust

Data retention is set on a per-index, per-indexer basis, so the forwarders will keep sending data to all three indexers, and the indexer whose index is full will delete its oldest buckets to make room for new incoming data.


Re: What does Splunk do when one index in an indexer has reached maximum capacity?

Communicator

So this means there's a risk of data loss, is that correct? I'm quite confused, as sometimes the forwarders send data to only one indexer and ignore the rest even though they are in load-balance mode.


Re: What does Splunk do when one index in an indexer has reached maximum capacity?

Motivator

So the forwarder load balancing is a little interesting. A forwarder switches targets at a regular interval (autoLBFrequency in outputs.conf, 30 seconds by default). This means that at any given time, a forwarder is sending to only one indexer. It isn't round robin; instead, the forwarder regularly re-randomizes the indexer list.

However, it only makes the switch when it's considered 'safe' to do so, to avoid half of an event going to Indexer A and the other half going to Indexer B. This means EOF on a file read, or 10 seconds of inactivity on a TCP connection.

So if your forwarders aren't keeping up with file writes, it's possible for them to get 'stuck' on an indexer, and for that 30 second period to extend quite a bit.

To mitigate this, you can set forceTimebasedAutoLB = true (again in outputs.conf), but then you run into potential problems with events getting split. I wouldn't recommend this.

It's also worth noting that the forwarder doesn't know anything about the state of the indexer besides it being a valid target for data. It doesn't know if a particular index is full or not.
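
For reference, the load-balancing settings discussed in this reply might look roughly like this on a forwarder. The group name and hostnames are hypothetical, and port 9997 is only the conventional receiving port; adjust to your environment.

```ini
# outputs.conf on the forwarder (group name and hostnames are hypothetical)
[tcpout]
defaultGroup = my_indexers

[tcpout:my_indexers]
server = indexer-a.example.com:9997, indexer-b.example.com:9997, indexer-c.example.com:9997
# Seconds between target switches; 30 is the default mentioned above.
autoLBFrequency = 30
# Forces a switch on schedule even mid-stream.
# Left commented out here because of the event-splitting risk.
# forceTimebasedAutoLB = true
```

Nothing in this file tells the forwarder about index sizes; it only lists which indexers are valid targets.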


Re: What does Splunk do when one index in an indexer has reached maximum capacity?

Communicator

I see now. This whole time it's been a misunderstanding on my part.
Thanks for the explanation, emiller.


Re: What does Splunk do when one index in an indexer has reached maximum capacity?

Esteemed Legend

It migrates the oldest buckets to frozen to make room for the new events (FIFO). It logs this when it happens; you will see an entry like this in _internal:

07-24-2014 01:30:51.609 +0200 INFO BucketMover - will attempt to freeze: candidate='/opt/splunk/var/lib/splunk/rest/db/db_#######_#######_#' because 

Re: What does Splunk do when one index in an indexer has reached maximum capacity?

Communicator

Is there any way to configure how much data is moved to frozen at a time? In my case, sometimes too much data is removed and sometimes too little (judging by the oldest event I can still find with a search).


Re: What does Splunk do when one index in an indexer has reached maximum capacity?

Splunk Employee

No, the bucket is the smallest unit of storage, so data is always frozen a whole bucket at a time.

In the long term, you can specify smaller hot buckets to avoid very large ones (up to 10GB buckets by default); try 500MB to start. But avoid making them too small, because that has a performance impact (especially on a cluster).
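
A minimal sketch of that suggestion, assuming the hot bucket size is controlled by maxDataSize in indexes.conf and using a made-up stanza name:

```ini
# indexes.conf on each indexer (stanza name is an assumption)
[sms]
# Maximum size of a hot bucket, in MB, before it rolls to warm.
# 500 is the starting point suggested above, not a tuned value.
maxDataSize = 500
```

As far as I know, maxDataSize also accepts the keywords auto and auto_high_volume (roughly 750MB and 10GB respectively on 64-bit systems), so check which one your index currently uses before changing it. The new size only applies to buckets created after the change.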
