Getting Data In

Indexes Config: Relationship of maxTotalDataSizeMB to home and cold

Contributor

What is the industry practice for setting the home and cold sizes? If home + cold = maxTotalDataSizeMB, should we set:

a.)
home - 50%
cold - 50%

b.)
home - 30%
cold - 70%

c.)
home - 10%
cold - 90%

d.) others

Hoping to get enlightenment from the crowd. Thanks a lot!

0 Karma

Contributor

This is a very helpful tool: https://splunk-sizing.appspot.com/

0 Karma

SplunkTrust

Hi @morethanyell

If you have a look here : https://docs.splunk.com/Documentation/Splunk/7.2.4/Indexer/HowSplunkstoresindexes#Bucket_names

You will notice that warm and cold buckets are essentially the same; the only difference is where they are stored. This lets you build storage tiering with different disk performance based on data age: fast disks (possibly SSD) -> fast search results for recent data in the home path, and slower disks -> slower search results for older data in the cold path.

Most users typically run searches over the last week; machine data older than that is rarely used for real-time analytics. It's therefore recommended to keep at least a week's worth of data in home storage, and the rest can go to cold storage. If you expect your users to need more than one week of "fast data", size your home storage accordingly.
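To make this concrete, here is a minimal indexes.conf sketch of the "one week fast, rest cold" approach. The index name, paths, and sizes are purely illustrative assumptions (roughly 10 GB/day of ingest with 90 days of total retention), not values from the docs:

```
# indexes.conf -- illustrative sizes only; adjust to your own ingest rate.
[my_index]                              # hypothetical index name
homePath   = /ssd/splunk/my_index/db          # fast disk: hot + warm buckets
coldPath   = /nas/splunk/my_index/colddb      # slower disk: cold buckets
thawedPath = $SPLUNK_DB/my_index/thaweddb

# Roughly one week of data on fast disk (~10 GB/day * 7 days)...
homePath.maxDataSizeMB = 70000
# ...and the remaining ~83 days on slower cold storage.
coldPath.maxDataSizeMB = 830000
# Total cap, equal to home + cold.
maxTotalDataSizeMB = 900000
```

With this layout the home/cold split falls out of your search patterns ("how much data needs to be fast?") rather than a fixed percentage.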

Let me know if you need more details.

Cheers,
David

Champion

Use this when disk space is limited and you want frequently searched data on a fast disk while rarely searched data goes to a network drive. If everything lives on the same disk, you don't need to worry about the split; the defaults of maxDataSize = auto or auto_high_volume work well.

For reference:
http://wiki.splunk.com/Community:UnderstandingBuckets
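For reference, a minimal sketch of the maxDataSize settings mentioned above, which control when a hot bucket rolls to warm (index names are hypothetical):

```
# indexes.conf -- bucket sizing presets.
[low_volume_index]                 # hypothetical
maxDataSize = auto                 # ~750 MB per bucket (the default)

[high_volume_index]                # hypothetical
maxDataSize = auto_high_volume     # ~10 GB per bucket on 64-bit systems,
                                   # ~1 GB on 32-bit systems
```

auto_high_volume is generally suggested for indexes receiving large daily volumes, since fewer, larger buckets reduce management overhead.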

Contributor

Thank you. So, basically, there's no industry standard.

0 Karma