Solved: Performance improvement by having multiple indexes...

Jason · ‎06-22-2010

A client asks: is there any performance improvement by having multiple indexes?

I'm guessing that there would be, if you were in a high-dataflow environment and could place different indexes on separate sets of fast local disks. Otherwise, no. Input appreciated!

gkanapathy · ‎06-22-2010

It depends very much on the data, how you are searching it, and exactly how it is split across indexes. There is no general answer that is always true. Different queries on the same data, or similar queries on slightly differently organized data, will be either slower or faster.

The answer also depends heavily on how the indexes themselves are stored. If you store all the indexes on the same physical disk, you will not see any improvement in (for example) needle-in-haystack searches over all indexes. If, on the other hand, the additional indexes are stored on separate physical disks, you will see improvements, though mostly because of the additional I/O available. Alternatively, you might simply take the same disks, stripe all the data across them, and put everything in a single index, in which case the performance impact again comes back to the particulars of your data and how you would have split it up.
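
For illustration only, here is a minimal sketch of what placing two indexes on separate physical disks might look like in `indexes.conf`. The volume names, mount points, and index names are hypothetical, and it assumes each mount point is backed by its own set of disks; whether this helps still depends on your data and searches, as described above.

```
# indexes.conf (sketch; volume names, mount points, and index names are hypothetical)

# Each volume points at a mount assumed to be backed by its own physical disks.
[volume:fast_disk_a]
path = /mnt/disk_a/splunk

[volume:fast_disk_b]
path = /mnt/disk_b/splunk

# First index: hot/warm and cold buckets live on disk set A.
[web_logs]
homePath   = volume:fast_disk_a/web_logs/db
coldPath   = volume:fast_disk_a/web_logs/colddb
thawedPath = $SPLUNK_DB/web_logs/thaweddb

# Second index: hot/warm and cold buckets live on disk set B.
[firewall_logs]
homePath   = volume:fast_disk_b/firewall_logs/db
coldPath   = volume:fast_disk_b/firewall_logs/colddb
thawedPath = $SPLUNK_DB/firewall_logs/thaweddb
```

The single-index alternative would instead define one index whose paths sit on a single striped volume spanning the same disks.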

gkanapathy · ‎06-22-2010

Note that you will almost certainly not come close to overloading a single (direct-attached 10k RPM) disk with a single Splunk indexer instance during indexing. Disk performance tends to become an issue when searching. Slow storage (slow network-attached storage, slow cheap disks, slow RAID configurations, slow controllers) may cause indexing problems, but in that case the worthwhile improvement is to move to the hardware we recommend.
