Monitoring Splunk

Performance improvement by having multiple indexes?

Jason
Motivator

A client asks: is there any performance improvement by having multiple indexes?

I'm guessing that there would be, if you were in a high-dataflow environment and could set different indexes to separate sets of fast local disk. Otherwise no. Input appreciated!

1 Solution

gkanapathy
Splunk Employee
Splunk Employee

It depends very much on the data, how you are searching it, and exactly how it is split across indexes. There is no general answer that is always true. Different queries on the same data, or similar queries on slightly differently organized data will be either slower or faster.

The answer is also extremely affected by how the indexes themselves would be stored. If you are going to store all the indexes on the same physical disk, then you are not going to get any improvements in (for example) needle-in-haystack searches over all indexes. If on the other hand, additional indexes are stored on separate physical disks, then you will have improvements, thought mostly due to the additional IO available. On the other hand, you might choose to simply take the same disks, stripe all the data across them, and put everything in a single index, in which case the performance impact will again come back to the particulars of your data and how you would have split it up.

View solution in original post

gkanapathy
Splunk Employee
Splunk Employee

It depends very much on the data, how you are searching it, and exactly how it is split across indexes. There is no general answer that is always true. Different queries on the same data, or similar queries on slightly differently organized data will be either slower or faster.

The answer is also extremely affected by how the indexes themselves would be stored. If you are going to store all the indexes on the same physical disk, then you are not going to get any improvements in (for example) needle-in-haystack searches over all indexes. If on the other hand, additional indexes are stored on separate physical disks, then you will have improvements, thought mostly due to the additional IO available. On the other hand, you might choose to simply take the same disks, stripe all the data across them, and put everything in a single index, in which case the performance impact will again come back to the particulars of your data and how you would have split it up.

gkanapathy
Splunk Employee
Splunk Employee

Note that you will almost certainly not be able to come close to overloading a single (direct-attached 10k RPM) disk with a single Splunk indexer instance during indexing. Disk performance tends to be an issue when searching. Slow storage (slow network-attached, slow cheap disks, slow RAID configurations, slow controllers) may cause indexing problems, but in that case worthwhile improvements are to go with hardware that we recommend.

0 Karma
Get Updates on the Splunk Community!

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...