Index planning

Path Finder

My company is just starting its deployment of Splunk, and one piece of advice I've heard repeatedly from existing Splunk users is that we shouldn't just throw all our logs into one index. What I'm struggling with, though, is how to split our data up: what's the best way, or what are good rules of thumb?

For example, we have several in-house applications that I'd like to set up comprehensive monitoring for (application logs, perfmon data, logs from RabbitMQ, etc.). I'm thinking we could set up the indexes for this data grouped by application:

Application 1 Index

  • Application 1 logs
  • Application 1 server perfmon data
  • Application 1 RabbitMQ logs

Application 2 Index

  • Application 2 logs
  • Application 2 server perfmon data
  • Application 2 RabbitMQ logs

Or we could set up the indexes by the type of data contained in them:

Application logs index

  • Application 1 logs
  • Application 2 logs

Server perfmon index

  • Application 1 server perfmon data
  • Application 2 server perfmon data

RabbitMQ index

  • Application 1 RabbitMQ logs
  • Application 2 RabbitMQ logs

Of these two setups, which one is generally considered to be the better option and why? Or is there some other index partitioning I should consider instead?

1 Solution

Splunk Employee

You should probably base your indexes off of two criteria:

  1. Retention time. How long do you need to keep the data in this index?
  2. Security. If a group of users should be able to search only specific sets of data, then those sets of data need to be in their own index(es).

Other than that, either of the two methods you mentioned above would work. If it were me, and all other things being equal, I'd definitely go with method 2 (indexes split by data type).


SplunkTrust

One thing you should consider is data access. If it's OK for Application 2 users to access Application 1 data, your second approach will work just fine. If each application's data has to be protected from users of the other application, approach 1 would be the better fit.


Splunk Employee

Part 2 -

Retention time - it's likely that you'll want to keep some data longer than other data. Retention is an index-level setting, so data with different retention requirements can't share an index. If you want perfmon data for 60 days, but application data for 180 days, you'll need to separate your indexes accordingly.
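
As a rough sketch of what that could look like in indexes.conf, assuming two hypothetical index names (perfmon and app_logs) and the default $SPLUNK_DB layout; frozenTimePeriodInSecs is the per-index setting that controls age-based retention:

```
# indexes.conf (illustrative only; index names and paths are assumptions)

[perfmon]
homePath   = $SPLUNK_DB/perfmon/db
coldPath   = $SPLUNK_DB/perfmon/colddb
thawedPath = $SPLUNK_DB/perfmon/thaweddb
# 60 days x 86400 seconds/day
frozenTimePeriodInSecs = 5184000

[app_logs]
homePath   = $SPLUNK_DB/app_logs/db
coldPath   = $SPLUNK_DB/app_logs/colddb
thawedPath = $SPLUNK_DB/app_logs/thaweddb
# 180 days x 86400 seconds/day
frozenTimePeriodInSecs = 15552000
```

Keep in mind that data is aged out a bucket at a time, so a bucket is only frozen once its newest event is older than the limit, and an index that hits its size cap (maxTotalDataSizeMB) can freeze data earlier than the time setting suggests. Frozen data is deleted by default unless you configure archiving.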

Splunk Employee

Sure. Two parts because it's a long answer -

Basically, it's the two criteria I mentioned: retention time and security.
Security - Say you have a group of sysadmins that need to be able to search server perfmon data, but don't need access to any of the other stuff. Search access is granted per index, so with method 1, where each application's index mixes perfmon data with logs, you couldn't give the sysadmins access to just the perfmon data and nothing else. With method 2, you'd simply limit the sysadmins' role to only be able to search the perfmon index.
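
A minimal sketch of such a role in authorize.conf, assuming a hypothetical role named sysadmin and a hypothetical index named perfmon (you could set up the same thing through the roles UI):

```
# authorize.conf (illustrative only; role and index names are assumptions)

[role_sysadmin]
# Allow members of this role to run searches at all
search = enabled
# Indexes this role is permitted to search
srchIndexesAllowed = perfmon
# Indexes searched when the search does not name an index
srchIndexesDefault = perfmon
```

One caveat: if the role inherits from another role via importRoles, it also picks up that role's allowed indexes, so check what the parent role can search before relying on this restriction.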


Path Finder

Can you elaborate on why you'd pick method 2 over method 1?
