Deployment Architecture

Is there a global setting for rolling of warm to cold and cold to frozen in an indexer cluster?

fatemabwudel
Path Finder

Hi,

So we are in the process of configuring the hot/warm and cold storage. The scenario is:
We have a cluster of 3 indexers, each with multiple indexes and one big 5TB volume to store the hot/warm, cold, and archived buckets of all indexes. We would like to partition the volume so that it holds around 671GB of hot/warm data, around 2TB of cold data, and the remainder as archive. We came up with these numbers from the retention period we want for each bucket stage (around 7 days for hot/warm, 25 days for cold, and 5 months for archive, at 200GB/day of indexing).
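The arithmetic behind those sizes can be sketched as follows. The ~50% on-disk compression ratio is an assumption used here to make the numbers line up; actual Splunk compression varies by data type:

```python
# Sizing sketch: partition a 5 TB volume by retention tier.
# Assumes roughly 50% on-disk compression of 200 GB/day raw ingest
# (an illustrative figure, not a measured one).
DAILY_INGEST_GB = 200
COMPRESSION = 0.5                                  # assumed on-disk ratio
on_disk_per_day = DAILY_INGEST_GB * COMPRESSION    # ~100 GB/day on disk

hot_warm_gb = on_disk_per_day * 7    # ~700 GB, close to the 671 GB above
cold_gb = on_disk_per_day * 25       # ~2500 GB, close to the 2 TB above
archive_gb = 5000 - hot_warm_gb - cold_gb          # whatever is left

print(hot_warm_gb, cold_gb, archive_gb)
```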

When I looked at configuring a global setting for all indexes to roll buckets from one stage to another based on the sizes mentioned above, I couldn't find a way to do that.
Hence, I would like to know whether there is a global setting to roll buckets from warm to cold and cold to frozen when the hot/warm and cold databases of multiple indexes are on the same volume. There are the per-index settings homePath.maxDataSizeMB and coldPath.maxDataSizeMB, but those roll data based on the size of that index's own hot/warm and cold databases, not on the overall size of the hot/warm and cold storage on the volume.

Any help would be appreciated.

1 Solution

maciep
Champion

indexes.conf does give you the ability to apply settings globally in the default stanza. So if all of your data is going to roll to frozen after 25 days, then frozenTimePeriodInSecs can be set globally. Any new index will inherit that default setting without you having to set it manually in the new index stanza, and anything set in the default stanza can be overridden by a setting in an index-specific stanza.

But even when set globally, these are still per-index settings. I don't believe there is a way to roll a bucket based on the overall size of the drive, or on the total bucket count across all indexes.
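A minimal sketch of what that default stanza could look like; the 25-day global value matches the retention mentioned above, and the index name and 10-day override are illustrative:

```
[default]
# 25 days * 86400 s/day = 2,160,000 s; inherited by every index
frozenTimePeriodInSecs = 2160000

[some_index]
# hypothetical per-index override: 10 days = 864,000 s
frozenTimePeriodInSecs = 864000
```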



ddrillic
Ultra Champion

The documentation describes the options vividly - indexes.conf

fatemabwudel
Path Finder

Thanks ddrillic. The first time I glanced through indexes.conf, I didn't notice the example provided at the very bottom.
The settings can be tweaked to use a whole volume and partition it based on the total size we want to give to hot/warm and cold storage. After looking at the example carefully, I came up with the following sample configuration that resolves my issue:

Define the one physical volume I have under two different volume names, each pointing to the same path but with a different size limit. All indexes then store their hot/warm data within the maximum space defined for the hot_warm volume, and once that limit is reached, buckets start rolling to cold:

[volume:hot_warm]
path = /splunk/dataBase
maxVolumeDataSizeMB = 671000

[volume:cold]
path = /splunk/dataBase
maxVolumeDataSizeMB = 2000000

[id1]
homePath = volume:hot_warm/id1/db
coldPath = volume:cold/id1/colddb
maxWarmDBCount = 4294967295

[id2]
homePath = volume:hot_warm/id2/db
coldPath = volume:cold/id2/colddb
maxWarmDBCount = 4294967295

...and so on for the remaining indexes.

Thanks for all the answers and quick responses, appreciate it.



fatemabwudel
Path Finder

Thanks Maciep for the response.
But then how do I control the rolling of warm to cold across the different indexes, so that each index keeps at least 7 days' worth of hot/warm data before it rolls to cold?
I can set maxWarmDBCount to control warm-to-cold rolling, but the problem is that indexes fill their warm buckets at very different rates. A large index will fill its warm buckets quickly and roll them to cold, where the data then sits for 25 days (based on frozenTimePeriodInSecs). A small index ingesting only MBs per day will fill its warm buckets slowly, so its data may stay in hot/warm well beyond 7 days before rolling to cold, where it then sits for another 25 days. In conclusion, some indexes would have less searchable (hot/warm plus cold) data, say at most 25 days (if data rolled to cold immediately and stayed there for 25 days), while others would have more than 25 days, maybe 40 (if data lingered in hot/warm before spending 25 more days in cold). This would cause our overall estimate of roughly 2.2TB of hot/warm and cold (for 7+25 days), with the remainder as archive, to no longer match our storage policy.
Also, I have two more questions:
1. Since I have only one big 5TB volume for all bucket types, can I set maxVolumeDataSizeMB to 3TB so that at any point in time there is at most 3TB of hot/warm and cold data? The documentation says: "if a volume contains both warm and cold buckets (which will happen if an index's homePath and coldPath are both set to the same volume), the oldest bucket will be rolled to frozen." Will buckets then be moved to the archive directory on the same volume (into the remaining 2TB) once the 3TB limit is reached?
2. Does frozenTimePeriodInSecs count from the time the bucket rolled to cold, or from when the bucket was first created, i.e. from the hot stage? If it counts the overall age of the bucket, I can simply set this attribute to 32 days' worth of time (7+25), the total time an index's data should be searchable, so that searchable data is consistent across all indexes.
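For reference, 32 days is 32 × 86,400 = 2,764,800 seconds. If the timer does run from bucket creation, the global setting would look like this (a sketch of the question's premise, not a confirmed answer):

```
[default]
# only valid if frozenTimePeriodInSecs is measured from the bucket's
# creation/event time rather than from the roll to cold
frozenTimePeriodInSecs = 2764800
```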

Thanks.
