Splunk index creation strategy: Is it better to have multiple smaller indexes or one large index?

jamesvz84
Communicator

My understanding is that having multiple smaller indexes performs better than having one large index that everything goes into.

Is this understanding correct? Is it better to have 10 indexes, one for each particular vendor device, rather than have them all go into one index? You can assume all reports for each device will be independent of the rest (no queries spanning multiple devices). However, if there ever was a need to query multiple devices at once, would it be better to have them in the same index?

What is the recommended strategy here?

ChrisG
Splunk Employee

The answer, as with so many questions like this about Splunk Enterprise, is probably "It depends." In this case, if you are thinking primarily about performance, it depends on what kinds of searches you plan to run on your various data sources and on the data volume from each device. Other considerations are user access and retention policies.

There is some guidance about this in the Managing Indexers and Clusters of Indexers manual.

martin_mueller
SplunkTrust

Apart from the hard "retention" and "access" criteria, here's a possible performance-based criterion:

Say you have one very chatty sourcetype spewing out 100GB/day that usually gets searched over the last couple of days, and a not-so-chatty sourcetype doing only 100MB/day that usually gets searched over several months.
Both are accessible by the same roles in Splunk and both should be kept around for a year, so you have no must-split reason.
However, searches on the not-so-chatty sourcetype will perform worse when it is mixed with the other sourcetype, simply because the usage patterns differ. The small sourcetype's data is forced to be spread over thousands of buckets, with only a few megs per bucket relevant to your search. Splitting that small sourcetype off into a second index would significantly improve this, though by how much is hard to say.
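
To put rough numbers on the example above, here is a back-of-the-envelope sketch in Python. It is a simplified model, not how Splunk actually schedules a search: it assumes buckets fill sequentially in time order up to a fixed size, and that a search has to open every bucket overlapping its time range. The 10GB bucket size, the 90-day search window, and the function name are assumptions made for illustration only; the real bucket size depends on your index settings.

```python
import math

def buckets_touched(daily_gb: float, days: int, bucket_gb: float) -> int:
    """Rough count of buckets overlapping a search window, assuming
    buckets fill one after another in time order up to bucket_gb each."""
    return max(1, math.ceil(daily_gb * days / bucket_gb))

# Volumes taken from the example in the post above.
CHATTY_GB_PER_DAY = 100.0   # chatty sourcetype, 100GB/day
QUIET_GB_PER_DAY = 0.1      # quiet sourcetype, 100MB/day

# Assumptions for illustration, not values from the post.
SEARCH_WINDOW_DAYS = 90     # "several months"
BUCKET_GB = 10.0            # hypothetical maximum bucket size

# One shared index: the quiet data is interleaved with the chatty data.
mixed_buckets = buckets_touched(
    CHATTY_GB_PER_DAY + QUIET_GB_PER_DAY, SEARCH_WINDOW_DAYS, BUCKET_GB
)

# Separate index: only the quiet sourcetype's own volume fills buckets.
split_buckets = buckets_touched(QUIET_GB_PER_DAY, SEARCH_WINDOW_DAYS, BUCKET_GB)

# Share of each mixed bucket that belongs to the quiet sourcetype, in MB.
relevant_mb_per_mixed_bucket = (
    BUCKET_GB * 1000 * QUIET_GB_PER_DAY / (CHATTY_GB_PER_DAY + QUIET_GB_PER_DAY)
)

print(f"mixed index:  {mixed_buckets} buckets in the search window")
print(f"split index:  {split_buckets} bucket(s) in the search window")
print(f"relevant data per mixed bucket: ~{relevant_mb_per_mixed_bucket:.1f} MB")
```

Under these assumptions the shared index spreads the quiet sourcetype over roughly 900 buckets with about 10MB of relevant data in each, while a dedicated index holds the same 90 days in a single bucket. Smaller buckets push the mixed-index count into the thousands. The real-world gain depends on how cheaply a search can skip irrelevant data inside each bucket, which this sketch ignores.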

mikebd
Path Finder

An additional benefit of separate indexes is that they preserve the option of provisioning dedicated higher-performance storage as needed. This can optimize storage costs when search activity causes relatively high disk I/O latency on indexes that need relatively little storage.
