Splunk index creation strategy: Is it better to have multiple smaller indexes or one large index?

jamesvz84
Communicator

My understanding is that having multiple smaller indexes performs better than putting everything into one large index.

Is this understanding correct? Is it better to have 10 indexes, one for each vendor device, rather than sending them all to one index? You can assume all reports for each device will be independent of the rest (no queries spanning multiple devices). However, if there were ever a need to query multiple devices at once, would it be better to have them in the same index?

What is the recommended strategy here?

ChrisG
Splunk Employee

The answer, as with so many questions like this about Splunk Enterprise, is probably "It depends." In this case, if you are thinking primarily about performance, it depends on what kinds of searches you plan to run on your various data sources and how much data each device produces. Other considerations are user access and retention policies.

There is some guidance about this in the Managing Indexers and Clusters of Indexers manual.

martin_mueller
SplunkTrust

Apart from the hard "retention" and "access" criteria, here's a possible performance-based criterion:

Say you have one very chatty sourcetype producing 100GB/day that is usually searched over the last couple of days, and a not-so-chatty sourcetype producing only 100MB/day that is usually searched over several months.
Both are accessible by the same roles in Splunk and both should be kept for a year, so you have no must-split reason.
However, searches on the not-so-chatty sourcetype will be slower when it is mixed with the other sourcetype, simply because the usage patterns differ. The small sourcetype's data ends up spread over thousands of buckets, with only a few megabytes per bucket relevant to your search. Splitting that small sourcetype off into a second index would significantly improve this, though by how much is hard to say.
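
To put rough numbers on that example, here is a back-of-the-envelope sketch in Python. It is not a model of how Splunk actually executes a search; it only counts how many buckets would cover the search window in each layout. The 90-day search window and the 750 MB bucket size are assumptions chosen for illustration, and every name in the code is hypothetical.

```python
def buckets_covering(daily_mb: int, days: int, bucket_mb: int) -> int:
    """Rough number of buckets needed to hold `days` of data (ceiling division)."""
    total_mb = daily_mb * days
    return max(1, -(-total_mb // bucket_mb))

CHATTY_MB_PER_DAY = 100_000  # ~100GB/day, from the example
QUIET_MB_PER_DAY = 100       # 100MB/day, from the example
SEARCH_DAYS = 90             # assumption: "several months"
BUCKET_MB = 750              # assumption: illustrative bucket size

# One shared index: the quiet sourcetype is interleaved with the chatty one.
mixed_buckets = buckets_covering(
    CHATTY_MB_PER_DAY + QUIET_MB_PER_DAY, SEARCH_DAYS, BUCKET_MB
)

# Dedicated index: only the quiet sourcetype's own buckets are in play.
split_buckets = buckets_covering(QUIET_MB_PER_DAY, SEARCH_DAYS, BUCKET_MB)

# Share of each mixed bucket that is relevant to a quiet-sourcetype search.
relevant_share = QUIET_MB_PER_DAY / (CHATTY_MB_PER_DAY + QUIET_MB_PER_DAY)
relevant_mb_per_bucket = BUCKET_MB * relevant_share

print(f"mixed index:  {mixed_buckets} buckets in the search window")
print(f"split index:  {split_buckets} buckets in the search window")
print(f"relevant data per mixed bucket: {relevant_mb_per_bucket:.2f} MB")
```

Under these assumptions the shared index puts thousands of buckets in the time range while the dedicated index puts about a dozen there. Real gains depend on bucket sizing, index-time metadata, and the searches themselves, so treat the ratio as an intuition aid rather than a prediction.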

mikebd
Path Finder

An additional benefit of discrete indexes is that they preserve the option of provisioning separate, higher-performance storage as necessary. This can reduce storage costs when search activity causes relatively high disk I/O latency on indexes that need relatively little storage capacity.
