My company is just starting its deployment of Splunk, and one piece of advice I've heard repeatedly from existing Splunk users is that we shouldn't just throw all our logs into one index. What I'm struggling with, though, is how to split our data up: what's the best way, or what are good rules of thumb?
For example, we have several in-house applications that I'd like to set up comprehensive monitoring for (application logs, perfmon data, RabbitMQ logs, etc.). I'm thinking we could set up the indexes for this data grouped either by application:
Application 1 Index
- Application 1 logs
- Application 1 server perfmon data
- Application 1 RabbitMQ logs
Application 2 Index
- Application 2 logs
- Application 2 server perfmon data
- Application 2 RabbitMQ logs
Or we could set up the indexes by the type of data contained in them:
Application logs index
- Application 1 logs
- Application 2 logs
Server perfmon index
- Application 1 server perfmon data
- Application 2 server perfmon data
RabbitMQ index
- Application 1 RabbitMQ logs
- Application 2 RabbitMQ logs
Of these two setups, which one is generally considered to be the better option and why? Or is there some other index partitioning I should consider instead?

You should probably base your indexes on two criteria:
- Retention time. How long do you need to keep the data in this index?
- Security. If a group of users should be able to search only specific sets of data, then those sets of data need to be in their own index(es).
Other than that, either of the two methods you mentioned above would work. If it were me, and all other things being equal, I'd definitely go with method 2.

One thing you should consider is data access. If it's OK for Application 2 users to access Application 1 data, your second approach will work just fine. If the data has to be secured against access by users of the other application, approach 1 would be the better fit.

Part 2 -
Retention time - it's likely that you'll want to keep some data longer than other data. Retention is an index-level setting, so if you want perfmon data for 60 days but application data for 180 days, you'll need to put them in separate indexes.
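To make the 60 vs. 180 day example concrete, here is a rough sketch in Python that does the day-to-second arithmetic and prints what the matching indexes.conf stanzas could look like. The index names (`perfmon`, `app_logs`) and the paths are made up for illustration. `frozenTimePeriodInSecs` is the per-index setting I'd expect to use here, but check the indexes.conf spec for your Splunk version before relying on it.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def retention_stanza(index_name: str, retention_days: int) -> str:
    """Render a minimal indexes.conf stanza for one index (illustrative only)."""
    # Splunk expresses age-based retention in seconds, not days.
    frozen_secs = retention_days * SECONDS_PER_DAY
    return "\n".join([
        f"[{index_name}]",
        # Paths below are assumed defaults, not taken from the thread.
        f"homePath   = $SPLUNK_DB/{index_name}/db",
        f"coldPath   = $SPLUNK_DB/{index_name}/colddb",
        f"thawedPath = $SPLUNK_DB/{index_name}/thaweddb",
        f"frozenTimePeriodInSecs = {frozen_secs}",
    ])


if __name__ == "__main__":
    # Hypothetical index names matching the example above.
    print(retention_stanza("perfmon", 60))
    print()
    print(retention_stanza("app_logs", 180))
```

Because the setting sits on the index stanza, two data sets with different retention needs can't share an index. Keep in mind that size limits on an index can also age data out earlier than the time setting suggests.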

Sure. Two parts because it's a long answer -
Basically, it's the two criteria I mentioned: retention time and security.
Security - Say you have a group of sysadmins who need to be able to search server perfmon data, but don't need access to any of the other stuff. With method 1, you couldn't give the sysadmins access to just the perfmon data and nothing else, because each application index also holds that application's logs and RabbitMQ logs. With method 2, you'd simply limit the sysadmin role to searching only the perfmon index.
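If it helps to see the sysadmin example spelled out, here is a small Python sketch that models the two layouts from the question as plain dictionaries and checks what else a role would see if it were granted every index holding perfmon data. The index and data-type names are made up, and this is only a model of the reasoning, not anything Splunk runs.

```python
# Which kinds of data live in each index under the two layouts from the question.
METHOD_1 = {  # one index per application
    "app1": {"app_logs", "perfmon", "rabbitmq"},
    "app2": {"app_logs", "perfmon", "rabbitmq"},
}
METHOD_2 = {  # one index per data type
    "app_logs": {"app_logs"},
    "perfmon": {"perfmon"},
    "rabbitmq": {"rabbitmq"},
}


def extra_data_exposed(layout, wanted):
    """Data types a role would also see if granted every index holding `wanted`."""
    exposed = set()
    for kinds in layout.values():
        if wanted in kinds:
            # Access is granted per index, so everything in the index comes along.
            exposed |= kinds
    return sorted(exposed - {wanted})


if __name__ == "__main__":
    print("method 1:", extra_data_exposed(METHOD_1, "perfmon"))
    print("method 2:", extra_data_exposed(METHOD_2, "perfmon"))
```

Under method 1 the sysadmins pick up application and RabbitMQ logs along with perfmon. Under method 2 they get perfmon and nothing else. In Splunk itself the restriction lives on the role, for example `srchIndexesAllowed = perfmon` under a `[role_sysadmin]` stanza in authorize.conf (the role name is hypothetical).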
Can you elaborate on why you'd pick method 2 over method 1?
