Performance improvement by having multiple indexes?

Jason
Motivator

A client asks: is there any performance improvement by having multiple indexes?

I'm guessing that there would be, if you were in a high-dataflow environment and could place different indexes on separate sets of fast local disks. Otherwise no. Input appreciated!

1 Solution

gkanapathy
Splunk Employee

It depends very much on the data, how you are searching it, and exactly how it is split across indexes. There is no general answer that is always true. Different queries on the same data, or similar queries on slightly differently organized data, may be either slower or faster.

The answer also depends heavily on how the indexes themselves would be stored. If you are going to store all the indexes on the same physical disk, then you are not going to get any improvement in (for example) needle-in-haystack searches over all indexes. If, on the other hand, the additional indexes are stored on separate physical disks, then you will see improvements, though mostly due to the additional IO available. Alternatively, you might choose to simply take the same disks, stripe all the data across them, and put everything in a single index, in which case the performance impact will again come back to the particulars of your data and how you would have split it up.
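
As a rough sketch of the separate-disk case, an indexes.conf along these lines would put two indexes on different mount points. The index names (web_logs, fw_logs) and the /mnt/disk1 and /mnt/disk2 paths are made up for illustration, and this assumes each mount point really is a separate physical disk; adjust everything to your own layout.

```ini
# indexes.conf (sketch; index names and mount points are hypothetical)

# First index lives entirely on the first physical disk.
[web_logs]
homePath   = /mnt/disk1/splunk/web_logs/db
coldPath   = /mnt/disk1/splunk/web_logs/colddb
thawedPath = /mnt/disk1/splunk/web_logs/thaweddb

# Second index lives on a separate physical disk, so searches against it
# do not compete with web_logs for the same disk IO.
[fw_logs]
homePath   = /mnt/disk2/splunk/fw_logs/db
coldPath   = /mnt/disk2/splunk/fw_logs/colddb
thawedPath = /mnt/disk2/splunk/fw_logs/thaweddb
```

Whether this beats striping both disks and keeping a single index still comes down to your data and how you search it.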

gkanapathy
Splunk Employee

Note that during indexing, a single Splunk indexer instance will almost certainly not come close to overloading a single (direct-attached 10k RPM) disk. Disk performance tends to become an issue when searching. Slow storage (slow network-attached storage, slow cheap disks, slow RAID configurations, slow controllers) may cause indexing problems, but in that case the worthwhile improvement is to move to the hardware that we recommend.
