My current Splunk architecture plan is to have a combo search head and deployment server along with 2 indexers clustered together. What would be the best way to configure index storage? 1) Per unit basis, 2) Separate buckets based on temperature (hot, cold, etc), or 3) Across indexes, using volumes?
Also, is there a matrix of some sort that can indicate best practices when it comes to the actual size of the buckets? Based on our current plans we would fall into the small enterprise category.
Thank you very much in advance.
Generally you'll want to keep the warm/cold volumes on the same filesystem so that the warm -> cold rolling is efficient. You will only need to have multiple filesystems if you want to break out the cold into a cheaper sort of storage.
I would stick with the defaults for bucket size, unless you have a particular index ingesting more than 10GB per day, in which case you can use the auto_high_volume setting. Check out indexes.conf for more details. The documentation has more to say on configuring the volume locations.
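To make that concrete, here is a minimal indexes.conf sketch. The index name is hypothetical; the paths shown are the stock Splunk defaults:

```
# indexes.conf -- "web_proxy" is a hypothetical index name.
# maxDataSize defaults to "auto" (~750MB buckets), which is fine for most indexes.
[web_proxy]
homePath   = $SPLUNK_DB/web_proxy/db
coldPath   = $SPLUNK_DB/web_proxy/colddb
thawedPath = $SPLUNK_DB/web_proxy/thaweddb
# Only for an index taking well over 10GB/day -- uses ~10GB buckets on 64-bit systems:
maxDataSize = auto_high_volume
```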
Since your indexers are already clustered, just make sure all your forwarders load-balance when sending data to the indexers, so data is spread uniformly across both indexers regardless of bucket size or number of indexes. For best performance, use time-based load balancing.
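On each forwarder, that looks roughly like the outputs.conf below. The hostnames are placeholders for your own indexers:

```
# outputs.conf on each forwarder -- idx1/idx2 hostnames are assumptions
[tcpout]
defaultGroup = indexer_cluster

[tcpout:indexer_cluster]
server = idx1.example.com:9997, idx2.example.com:9997
# Time-based load balancing: switch target indexer every 30 seconds
autoLBFrequency = 30
```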
See more here http://docs.splunk.com/Documentation/Splunk/6.2.5/Forwarding/Setuploadbalancingd
The clusters have not been set up yet. We are in the process of gathering resource requirements for our servers that will host 2 clustered indexers and 1 Search Head/Deployment Server. I am attempting to put together sizing requirements for memory and hard disk space and what is the best way to configure indexing.
Thank you,
Tom
I would go through the Splunk Installation manual (link below) for that.
http://docs.splunk.com/Documentation/Splunk/6.1.9/Installation/Beforeyouinstall
That doc did not really provide the information needed.
Back to the original question: What would be the best way to configure index storage? 1) Per unit basis, 2) Separate buckets based on temperature (hot, cold, etc), or 3) Across indexes, using volumes?
I guess this should shed some light (read the first paragraph):
http://docs.splunk.com/Documentation/Splunk/6.2.5/Indexer/Usemultiplepartitionsforindexdata
It's recommended to use a single high-performance filesystem to hold your index data for the best experience, but that can be costly. If you have clarity on how frequently the historical/older data will be searched, you can keep the hot/warm buckets on local/faster disk and the cold database on slower/shared drives. Hope this helps in some way.
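If you do split tiers, volumes let you express that once and reuse it across indexes. A sketch, with illustrative volume names and mount points:

```
# indexes.conf -- volume names and paths are illustrative assumptions
[volume:fast]
path = /opt/splunk/var/lib/splunk      # local SSD / fast disk for hot+warm

[volume:slow]
path = /mnt/nas/splunk                 # cheaper shared storage for cold

[firewall]
homePath   = volume:fast/firewall/db        # hot and warm buckets
coldPath   = volume:slow/firewall/colddb    # cold buckets
thawedPath = /mnt/nas/splunk/firewall/thaweddb   # thawedPath cannot use volume: syntax
```

Splunk rolls buckets from warm to cold automatically as the volumes fill, so sizing the volumes (maxVolumeDataSizeMB) effectively controls your retention split across tiers.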