Splunk Search

How many indexes can I create in a Splunk instance?

Motivator

How many indexes can I create on an indexer?

Are there any disadvantages to having too many indexes?

Would keeping all the logs in a single index hurt performance, i.e. slow down the processing of search requests?

Ultra Champion

Eeh, I think that the metadata regarding which sourcetypes exist where is on a more granular level - in the index buckets. That is where you find files like Hosts.data, Sources.data and SourceTypes.data.

From that I assume that you'd need to open every bucket in every index in order to determine if the sourcetype is present in that index.

Not efficient.

However, if you can instruct your users to type index=blaha instead of sourcetype=blaha, you would get a performance boost. How much will depend on other factors - one of the most important will be the time range for the query.

Still, I would not recommend doing this, since you'd have to micromanage the disk usage for all those indexes, unless you have a really strong use case for the access restrictions.

/K

Legend

You can create as many as you want; however, more indexes do not mean better performance. If you spread your data across many different indexes, the effect is rather the opposite: if you don't specify an index in your search, Splunk will need to open each index to check whether the events you're searching for are in there.

Dividing up data across several indexes is not something you do for performance reasons, rather it's something you do if you want either different periods for how long data will be kept, or different access permissions (for instance user A is allowed to access index X but not index Y, whereas user B is allowed to access index Y but not index X).
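As a rough illustration of those two reasons, here is a minimal Python sketch. It is a toy model, not Splunk's API: the `Index` class, the `ROLE_INDEXES` mapping, the role names and the retention values are all hypothetical and chosen only to mirror the user A / user B example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Index:
    """Hypothetical stand-in for an index with its own retention period."""
    name: str
    retention_days: int


# Hypothetical indexes: X keeps data for 30 days, Y for a year.
INDEXES = {
    "X": Index("X", retention_days=30),
    "Y": Index("Y", retention_days=365),
}

# Hypothetical role-to-index permissions (user A sees X, user B sees Y).
ROLE_INDEXES = {
    "role_a": {"X"},
    "role_b": {"Y"},
}


def searchable_indexes(role, requested):
    """Return the requested indexes that the role is allowed to search."""
    allowed = ROLE_INDEXES.get(role, set())
    return sorted(allowed & set(requested))


def is_expired(index, event_age_days):
    """An event is dropped once it is older than its index's retention."""
    return event_age_days > index.retention_days
```

In this model, a 90-day-old event would already be gone from index X but still present in index Y, and neither role can see the other's index, which is the kind of separation that a single shared index cannot express.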

Motivator

Ok, let's say we want to add an index for each sourcetype (a 1 to 1 index/sourcetype ratio) to allow for the most granular security rules for access to events.

Let's assume I define a maximum of 100 indexes.
Assuming a user always specifies a sourcetype in their searches, will Splunk still check each index for that sourcetype on every search?
Does it just check some metadata field for each index rather than search through all events in an index?
Is the performance impact significant?

Builder

Rob,

First, keep in mind that indexes are not the only way to segregate data access: you can also define a search term restriction for each role.

Second, if you split your indexes so that each holds just one sourcetype, specifying "sourcetype=x" is not enough; for better performance, also add "index=X". Otherwise, as Ayn already wrote, your search will cause Splunk to check every other index to see whether the terms are present there, and that takes some time as well.

Moreover, keep in mind the kind of searches you'll have to run. Will you only ever search ONE sourcetype at a time, with no cross-sourcetype correlation? If not, then again, having the data split across several indexes is not the best setup.

Happy Splunking!
Marco

Legend

This means that even if you specify the sourcetype, two factors will impact performance:

  • The time range you choose for your search, because a longer time range = more buckets.
  • The number of indexes you choose for your search, because more indexes = more buckets.
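
To put rough numbers on those two factors, here is a small Python sketch. It is only a toy model under stated assumptions: the `Bucket` class and both functions are hypothetical, and the layout of exactly one bucket per index per day is invented for the arithmetic (real bucket sizes and time spans vary).

```python
from dataclasses import dataclass

DAY = 86400  # seconds


@dataclass(frozen=True)
class Bucket:
    """Hypothetical bucket: belongs to one index and covers a time span."""
    index: str
    earliest: int  # epoch seconds
    latest: int    # epoch seconds


def make_buckets(index_names, days):
    """Assumption: one bucket per index per day, for `days` days."""
    return [
        Bucket(name, d * DAY, (d + 1) * DAY - 1)
        for name in index_names
        for d in range(days)
    ]


def buckets_to_check(buckets, start, end, indexes=None):
    """Buckets overlapping [start, end], limited to `indexes` if given."""
    return [
        b for b in buckets
        if b.latest >= start and b.earliest <= end
        and (indexes is None or b.index in indexes)
    ]


names = [f"idx{i}" for i in range(100)]
buckets = make_buckets(names, days=30)

all_30_days = len(buckets_to_check(buckets, 0, 30 * DAY - 1))            # 3000
all_7_days = len(buckets_to_check(buckets, 0, 7 * DAY - 1))              # 700
one_index_7_days = len(buckets_to_check(buckets, 0, 7 * DAY - 1, {"idx0"}))  # 7
```

Under these assumptions, shortening the time range from 30 to 7 days cuts the candidate buckets from 3000 to 700, and additionally naming a single index cuts them to 7.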

Legend

Yes and no - Splunk will not initially check the actual raw events in the index, but it will check the lexicon which is a data structure within each bucket in an index. The lexicon holds metadata about the events the index contains, such as any fields set at index-time. Splunk will check the lexicon in each bucket that falls within the selected timerange for your search.
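
The idea of checking the lexicon before touching raw events can be sketched in Python as follows. This is a simplified illustration, not Splunk's actual on-disk format or search code: the `Bucket` class, its `sourcetypes` set standing in for the lexicon, and the sample events are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Bucket:
    """Hypothetical bucket: a small metadata set plus the raw events."""
    index: str
    sourcetypes: set  # stand-in for the lexicon's index-time metadata
    raw_events: list = field(default_factory=list)  # (sourcetype, text)


def search_sourcetype(buckets, sourcetype):
    """Return (events, lexicons_checked, buckets_read) for one sourcetype.

    Every bucket in range has its metadata checked, but raw events are
    only read from buckets whose metadata lists the sourcetype.
    """
    events, lexicons_checked, buckets_read = [], 0, 0
    for bucket in buckets:
        lexicons_checked += 1
        if sourcetype not in bucket.sourcetypes:
            continue  # cheap check failed: skip the raw events entirely
        buckets_read += 1
        events.extend(e for e in bucket.raw_events if e[0] == sourcetype)
    return events, lexicons_checked, buckets_read


buckets = [
    Bucket("web", {"access_combined"}, [("access_combined", "GET /")]),
    Bucket("os", {"syslog"}, [("syslog", "sshd start"), ("syslog", "cron run")]),
    Bucket("app", {"log4j"}, [("log4j", "INFO ready")]),
]

events, lexicons_checked, buckets_read = search_sourcetype(buckets, "syslog")
```

In this example the search inspects the metadata of all three buckets but reads raw events from only one, which is why the cost still grows with the number of buckets even when the sourcetype is specified.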