How to properly configure and monitor index retention in an indexer cluster?


We have moved to a new 3-Indexer environment with Index Replication from a 1-Indexer environment. We moved all of the buckets from the old environment to the 3 new Indexers, split up evenly in their cold bucket directories.

The new Indexers are configured with local SSD storage for hot/warm, and a DAS for cold storage.

I am having a difficult time understanding how to monitor the hot/warm storage vs cold storage. I want to be able to keep 30 days of data in our indexes on our SSD drives, and move everything else to cold.

For example, here is what I have configured for our netflow index in indexes.conf:

[volume:HotWarm]
path = /logs
maxVolumeDataSizeMB = 700000

[volume:Cold]
path = /daslogs
maxVolumeDataSizeMB = 22500000

[netflow]
maxDataSize = auto_high_volume
homePath = volume:HotWarm/netflow/db
homePath.maxDataSizeMB = 100000
coldPath = volume:Cold/netflow/colddb
coldPath.maxDataSizeMB = 300000
thawedPath = /daslogs/netflow/thaweddb
frozenTimePeriodInSecs = 7776000

For the stanza above, am I to configure this for what the local indexer sees? Or the total among the 3 indexers? It seems to behave as if it is configured for the local indexer, meaning once this indexer reaches its maxDataSizeMB limit, it starts moving buckets to cold and freezing buckets.

On each indexer, this is how much space the HotWarm takes up for netflow:

On each indexer, this is how much space the Cold takes up for netflow:

Here is the query I am using to monitor HotWarm:

| dbinspect index=netflow state=hot
| stats sum(sizeOnDiskMB) as HotSize
| appendcols [ | dbinspect index=netflow state=warm | stats sum(sizeOnDiskMB) as WarmSize]
| eval HotWarm = HotSize + WarmSize
| eval HotWarmTotal = HotWarm / 1024
| gauge HotWarmTotal 0 100 200 300

Result is 274.5G

Here is the query I am using to monitor Cold:
| dbinspect index=netflow state=cold
| stats sum(sizeOnDiskMB) as sizeOnDiskMB by state
| eval sizeOnDiskGB = sizeOnDiskMB / 1024
| gauge sizeOnDiskGB 0 300 600 900

Result is 878.6G

So, the queries I am using add up all 3 indexers, but the indexes.conf settings are essentially set up as the total divided by 3.
Is this how monitoring should be done? When I run these queries, should I just assume that the load is split evenly across the 3 indexers?

Sorry for the long post. Just trying to explain this thoroughly.




The indexes.conf file applies to an individual indexer (peer node). It has no knowledge of how many other members are in the cluster or how much data might be on each member.

Therefore, your indexes.conf needs to be designed per indexer.

For example:
[volume:HotWarm]
path = /logs
maxVolumeDataSizeMB = 700000

That is roughly 683 GB per indexer (700000 / 1024), not 683 GB of hot/warm data for the entire cluster.
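
The same arithmetic applies to the per-index limits in the netflow stanza. As a rough sketch, assuming 3 indexers and the homePath.maxDataSizeMB and coldPath.maxDataSizeMB values from the question, this throwaway search converts the per-indexer limits into the cluster-wide totals you would expect a summed dbinspect to approach:

```
| makeresults
| eval indexers = 3, homeLimitMB = 100000, coldLimitMB = 300000
| eval clusterHomeGB = round(indexers * homeLimitMB / 1024, 1)
| eval clusterColdGB = round(indexers * coldLimitMB / 1024, 1)
| table indexers clusterHomeGB clusterColdGB
```

That works out to about 293 GB for hot/warm and about 879 GB for cold across the cluster. The 274.5 GB and 878.6 GB reported in the question sit just under those figures, which would be consistent with each indexer enforcing the limits on its own, and with cold already being at or near its cap on every indexer.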

You could narrow your dbinspect down to a single indexer. Dividing the total by the number of indexers should roughly work, but that assumes the data is evenly balanced across your indexer cluster members.
Newer Splunk versions support data rebalancing to help with this.
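
One way to avoid the divide-by-3 assumption is to split the dbinspect output by splunk_server, so that each row can be compared directly against the per-indexer limits in indexes.conf. This is a sketch rather than a tested dashboard search. The tier, sizeGB, and oldestDays names are made up for illustration, while state, sizeOnDiskMB, startEpoch, and splunk_server are fields that dbinspect returns:

```
| dbinspect index=netflow
| where state="hot" OR state="warm" OR state="cold"
| eval tier = if(state="cold", "Cold", "HotWarm")
| stats sum(sizeOnDiskMB) as sizeMB min(startEpoch) as oldestEpoch by splunk_server tier
| eval sizeGB = round(sizeMB / 1024, 1)
| eval oldestDays = round((now() - oldestEpoch) / 86400, 1)
| table splunk_server tier sizeGB oldestDays
```

The oldestDays column for the HotWarm rows gives a rough idea of how many days of data are still on the SSDs for each indexer, which you can compare against your 30-day target. Treat it as an approximation, since a single bucket containing events with old timestamps will inflate the number.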

I use the query:
| tstats count WHERE index="*" by splunk_server _time span=10m | timechart span=10m sum(count) by splunk_server

I then visualize in an area graph with 100% stacked mode to see if the data is even among cluster members or not.

If it's not even, then you might need to do some tweaking and also run a data rebalance.
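
If you would rather have a number than a chart, a variation on the same tstats search can report each indexer's share of the events. This is only a sketch: it is limited to the netflow index, pct is an illustrative field name, and event counts are a proxy for balance rather than a measure of disk usage:

```
| tstats count WHERE index=netflow by splunk_server
| eventstats sum(count) as total
| eval pct = round(100 * count / total, 1)
| table splunk_server count pct
```

With 3 evenly balanced indexers, each pct value should land somewhere near 33.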


I would love to see an answer to this question.
