While explaining Splunk to someone, I wondered how to explain the purpose of creating a separate index.
I feel like I know when to create a new index for my data, but are there any rules about it? Does it depend on the type of data I’m indexing, or on its schema?
I hope someone can give a good explanation for someone who’s new to Splunk.
From the Splexicon:
The repository for data in Splunk Enterprise. When Splunk Enterprise indexes raw event data, it transforms the data into searchable events. Indexes reside in flat files on the Splunk Enterprise instance known as the indexer.
There are "no rules" for it. You just put data in. The schema is built on the fly at search time, so don't worry about it at the indexing stage. You can use a separate index for access control separation, for example 🙂
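To make the access control point a bit more concrete, here is a rough Python sketch that prints authorize.conf role stanzas limiting each role to certain indexes. The role and index names are made up for illustration. I'm assuming the standard `srchIndexesAllowed` and `srchIndexesDefault` settings with semicolon-separated lists, so check the authorize.conf spec for your version before using anything like this.

```python
# Sketch only: renders authorize.conf role stanzas that restrict search access
# to specific indexes. Role names and index names are hypothetical examples.

def render_role_stanza(role, allowed_indexes):
    """Return an authorize.conf stanza limiting `role` to `allowed_indexes`."""
    if not allowed_indexes:
        raise ValueError("a role needs at least one allowed index")
    # Assumption: index lists in authorize.conf are semicolon-separated.
    joined = ";".join(allowed_indexes)
    return "\n".join([
        f"[role_{role}]",
        f"srchIndexesAllowed = {joined}",
        f"srchIndexesDefault = {joined}",
    ])


# Hypothetical mapping of roles to the indexes they may search.
ROLE_INDEXES = {
    "security_team": ["security"],
    "sales_team": ["sales", "weblogs"],
}

if __name__ == "__main__":
    for role, indexes in ROLE_INDEXES.items():
        print(render_role_stanza(role, indexes))
        print()
```

The point is that this kind of separation only works if the data lives in different indexes in the first place; you can't grant a role access to part of an index this way.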
@omerl, creating separate indexes in Splunk enables several features:
Access management - Different user roles can be set up with access to different indexes. For example, the Security team should have access to the security-related index, and the Application/Sales team should have access to the Sales/Web Logs index.
Data retention period - Each index can be set up with its own retention and rollover windows, depending on the use case. For example, Sales data may need to be retained for a minimum of 1 year, while Security Audit information may need to be retained for 5-10 years.
Storage based on access frequency - Buckets in an index roll over from Hot/Warm (fast, frequently accessed) to Cold based on the index's retention period and size settings. With this architecture, less frequently accessed data can be moved to relatively slower disk.
Index Compression - Indexed data can be compressed to optimize storage.
Faster Search - Naming the index in the search SPL (for example, index=security) restricts the search to events in that index, so events in other indexes are not scanned.
Summary Index - A summary index can store statistical summaries from the actual index for faster search, historical data retention and prediction (via mechanisms such as accelerated reports/data models, the collect command, etc.).
Metrics Index - Splunk 7.0 introduced the metrics index, which can store data received via metrics protocols such as statsd and collectd, and which can perform up to 200 times better than a regular Splunk index.
You should check the Splunk documentation for details on each of these, and definitely go through the Splunk Search Fundamentals 1 course, which is free.
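As a rough illustration of the retention point, here is a small Python sketch that renders an indexes.conf stanza per index with its own retention. The index names and retention values are just examples taken from the cases above, and the helper itself is hypothetical. It assumes the usual `homePath`, `coldPath`, `thawedPath`, `frozenTimePeriodInSecs` and `maxTotalDataSizeMB` settings, so verify against the indexes.conf spec for your version.

```python
# Sketch only: renders indexes.conf stanzas with per-index retention.
# Index names, retention periods and size caps are hypothetical examples.

SECONDS_PER_DAY = 24 * 60 * 60


def render_index_stanza(name, retention_days, max_total_mb=None):
    """Return an indexes.conf stanza for `name` with the given retention."""
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    lines = [
        f"[{name}]",
        # Hot/warm buckets live under homePath, cold buckets under coldPath.
        # coldPath could point at slower, cheaper storage instead.
        f"homePath = $SPLUNK_DB/{name}/db",
        f"coldPath = $SPLUNK_DB/{name}/colddb",
        f"thawedPath = $SPLUNK_DB/{name}/thaweddb",
        # Events older than this many seconds become eligible for freezing.
        f"frozenTimePeriodInSecs = {retention_days * SECONDS_PER_DAY}",
    ]
    if max_total_mb is not None:
        # Optional cap on the total size of the index.
        lines.append(f"maxTotalDataSizeMB = {max_total_mb}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Sales data kept for 1 year, security audit data for 5 years.
    print(render_index_stanza("sales", retention_days=365))
    print()
    print(render_index_stanza("security_audit", retention_days=5 * 365))
```

Since retention is configured per index, two data sets with different retention requirements are a fairly strong hint that they belong in separate indexes.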