Getting Data In

Why does Splunk have multiple indexes?

Splunk Employee

With Splunk's normalizing, timestamp-based event indexing combined with its powerful search language and processing commands, one would think that all you need is one big main index.

So why is there more than one index and what are the reasons for creating additional indexes?

2 Solutions

Splunk Employee

Multiple indexes are usually indicated for two reasons:

  • Physical data separation
    • This may be related to access control. Separate indexes are not required to control access to data, but with current (v4.1) Splunk management capabilities, access control is easiest to configure with separate indexes.
  • Differential retention periods for different data sets
    • This includes summary indexing at different time densities, test indexes, and cases where some data has longer retention requirements while other (often extremely high-volume) data has shorter ones.
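
To make the two reasons above concrete, here is a minimal sketch of what the configuration might look like. The index names (firewall, audit), the role name, and the retention values are made up for illustration; check the indexes.conf and authorize.conf spec files for your version before relying on any of it.

```ini
# indexes.conf (sketch; index names and retention values are hypothetical)

# High-volume data that only needs to be kept for about 30 days
[firewall]
homePath   = $SPLUNK_DB/firewall/db
coldPath   = $SPLUNK_DB/firewall/colddb
thawedPath = $SPLUNK_DB/firewall/thaweddb
# 30 days * 86400 seconds per day
frozenTimePeriodInSecs = 2592000

# Low-volume data with a longer retention requirement (about 1 year)
[audit]
homePath   = $SPLUNK_DB/audit/db
coldPath   = $SPLUNK_DB/audit/colddb
thawedPath = $SPLUNK_DB/audit/thaweddb
# 365 days * 86400 seconds per day
frozenTimePeriodInSecs = 31536000

# authorize.conf (sketch; role name is hypothetical)

# A role that can search only the firewall index
[role_firewall_analyst]
srchIndexesAllowed = firewall
srchIndexesDefault = firewall
```

Because retention is set per index, the only way to age out the firewall data sooner than the audit data is to keep them in separate indexes.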

Performance is not a typical consideration, and the effect of multiple indexes vs a single one for a given set of data varies greatly depending on the exact nature of the data and the exact queries or mix of queries to be performed against it.

SplunkTrust

In addition to gkanapathy's answer, additional indexes seem to be part and parcel of how summary indexing works.

http://www.splunk.com/base/Documentation/4.1.1/Knowledge/Usesummaryindexing
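
As a rough illustration, a scheduled search that writes its results to a dedicated summary index might be configured something like this. The stanza name, the search string, and the summary_errors index are assumptions for the example, and the target index would also have to be defined in indexes.conf.

```ini
# savedsearches.conf (sketch; search name, search string, and index name are hypothetical)
[Summary - login errors by host]
search = index=main "login error" | stats count by host
# Run at the top of every hour over the previous full hour
enableSched = 1
cron_schedule = 0 * * * *
dispatch.earliest_time = -1h@h
dispatch.latest_time = @h
# Write the results to a separate summary index instead of main
action.summary_index = 1
action.summary_index._name = summary_errors
```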

Splunk Employee

There are performance goals as well: sparse data (such as login errors) is faster to search when it is kept apart from bulk data (such as firewall rule traversals). Creating multiple indexes adds administrative overhead (you have to configure them), but when you have large amounts of data with very different volumes in a high-performance environment, it can be worthwhile. This is the main reason that summary indexing goes to a new index (it could use the same one).

There are also more obscure performance cases, such as using different segmentation per index, but ideally that should not be necessary.

Splunk Employee

ha! bad Maverick, bad!

Splunk Employee

maverick is on a vendetta against me, jrodman, and other Splunk employees on this site.

SplunkTrust

I have no idea why this was considered the best answer hah
