Significant search performance hit using multiple indexes

andywins
Explorer

I'm seeing three seconds of latency introduced to each search when using ~3,500 indexes. Here's the scenario:

  • ~3,000,000 events for source "X", all stored within the main default index
  • No data in any of the other custom ~3,500 indexes
  • Free license
  • 32 GB RAM, i7 series 4, 1 TB SSD
  • I've not violated the free tier indexing threshold

Using the admin account, a search as simple as "sourcetype=x" seems to wait roughly three seconds before beginning to fetch results. When I remove all the custom indexes, things fly like normal. Considering I'm not storing data in the extra indexes yet, I wouldn't expect a noticeable performance impact. As I create additional indexes, performance seems to drop.

What could be causing this? This link suggests it's not the number of indexes that counts, it's the data. My experience shows the opposite. What is the maximum number of indexes per instance before running into issues?

1 Solution

alacercogitatus
SplunkTrust

I'd wager that it is still tied to the number of indexes. Even if those indexes don't contain data, the bloom filter on each bucket of each index must be checked for matches. So even though it's a quick check of the bloom filter, you are still performing it ~3500 * #BucketsInIndex times. Try running your search like this and see if it speeds up:

index=main sourcetype=x | blah blah

http://docs.splunk.com/Splexicon:Bloomfilter
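
If searches are generated by a script rather than typed by hand, the index restriction can be applied in one place. A minimal Python sketch, assuming a hypothetical helper named `scope_search` (it is not part of any Splunk SDK, and the index name `web` is made up); it only builds the search string and does not talk to Splunk:

```python
def scope_search(query, indexes):
    """Prefix an SPL query with an explicit index restriction.

    Hypothetical helper for illustration; it only builds a string.
    """
    if not indexes:
        raise ValueError("at least one index is required")
    terms = [f"index={name}" for name in indexes]
    # A single index needs no parentheses; several are OR'ed together.
    prefix = terms[0] if len(terms) == 1 else "(" + " OR ".join(terms) + ")"
    return f"{prefix} {query}"


print(scope_search("sourcetype=x", ["main"]))
print(scope_search("sourcetype=x", ["main", "web"]))
```

The first call yields the same search as above, `index=main sourcetype=x`; the second shows how several indexes could be named explicitly.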

andywins
Explorer

Indeed, limiting the search to explicit indexes improves performance. Thanks, alacercogitatus.

alacercogitatus
SplunkTrust

Just a reminder: please accept the answer if we have answered your question. Thanks!

emiller42
Motivator

I would recommend one index per customer. You can use sourcetypes to differentiate the datafeeds.

Ayn
Legend

As long as you can single out the indexes you need for each query, this won't be as big a problem, since in that case Splunk knows immediately which indexes to open. If you do need to run loads and loads of searches over ALL indexes, that could be more problematic.

andywins
Explorer

Thanks for the comments. I'll be bulk loading some data this week to vet the suggested approach. I'm using two Samsung 840 Pro 500 GB SSDs, striped. gkanapathy, we have this many customers and I want to partition their data, both for speed and security. Right now I can create roles/users locked down to a specific index. Ideally, I would have an index for each datafeed per customer (15 feeds * 3500 clients = 52,500 indexes). Let me know if you can think of a better approach.
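
For a sense of scale, here is a back-of-the-envelope estimate in Python using only the figures reported in this thread (about 3 seconds of added latency at about 3,500 indexes). It assumes the overhead grows linearly with the number of indexes, which nobody in the thread has measured, so treat the result as a rough guess rather than a prediction:

```python
# Figures reported in this thread.
observed_latency_s = 3.0   # added delay per unrestricted search
observed_indexes = 3500    # indexes present when the delay was seen

feeds_per_client = 15
clients = 3500

# One index per feed per client vs. one index per client.
per_feed_indexes = feeds_per_client * clients
per_client_indexes = clients

# ASSUMPTION: overhead scales linearly with index count (not measured).
per_index_s = observed_latency_s / observed_indexes


def estimated_latency_s(index_count):
    """Extrapolated added latency for a search that names no index."""
    return index_count * per_index_s


print("indexes, one per feed per client:", per_feed_indexes)
print("overhead per index (ms):", round(per_index_s * 1000, 2))
print("estimated delay (s):", round(estimated_latency_s(per_feed_indexes), 1))
```

Under that linear assumption, searches that do not name an index would wait roughly 15 times longer with one index per feed per client than with one index per client. Searches restricted to explicit indexes should not pay this cost, per the accepted answer.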

gkanapathy
Splunk Employee

Uh, 3500 indexes is way above the expected parameters. I would hesitate to use more than 200. Frankly, I would be happy to see only 3 seconds of latency on searches with that many indexes, but I suspect that's because you have a good SSD. Every index of course means more places to look for every search (even if it's empty, it's impossible to know it's empty w/o looking), as well as overhead checking whether it's full, etc.

What are you using 3500 indexes for? I wonder whether you need that many or whether you can just put the data into fewer indexes.

linu1988
Champion

Sorry, I should have been more specific: I meant the dashboards, which have proper search queries.

alacercogitatus
SplunkTrust

But if you don't explicitly declare the index in your search, the search still hits every index location and, depending on your disk speed, might introduce some latency.

linu1988
Champion

But when I added another 20 indexes to the existing 20, I felt the searches had slowed down, even though the new indexes didn't contain any data. Any possible reason?

linu1988
Champion

I wouldn't create this many indexes, but I'm curious to know the reason, as I see something similar on my side.
