Getting Data In

Is there a recommendation for how many sourcetypes an index should have?

packet_hunter
Contributor

I imagine that having one index per sourcetype would cause problems, but is there an ideal number of sources for an index?

Thank you

1 Solution

richgalloway
SplunkTrust

There are other considerations besides sources or sourcetypes. Indexes should be set up based on data retention periods and access rules. In other words, sourcetypes that must be retained for different lengths of time must be in different indexes. Similarly, if access to some data is to be limited to certain users/roles, then that data must be in a separate index.
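
The rule above can be sketched as a grouping exercise: sourcetypes that share both a retention period and an access list can share an index, and each distinct combination needs at least one index of its own. This is only an illustrative sketch. The sourcetype names, retention values, and role names are all hypothetical, and `plan_indexes` is a made-up helper, not part of Splunk.

```python
from collections import defaultdict

# Hypothetical inventory: sourcetype -> (retention in days, roles allowed to search it).
# None of these names or values come from the thread; they are placeholders.
SOURCETYPES = {
    "firewall_logs": (90, frozenset({"security"})),
    "ids_alerts": (90, frozenset({"security"})),
    "web_access": (30, frozenset({"security", "webops"})),
    "hr_audit": (365, frozenset({"hr"})),
}


def plan_indexes(sourcetypes):
    """Group sourcetypes that share both retention and access requirements.

    Each distinct (retention, roles) pair needs at least one index of its own.
    """
    groups = defaultdict(list)
    for name, (retention_days, roles) in sourcetypes.items():
        groups[(retention_days, roles)].append(name)
    return {key: sorted(names) for key, names in groups.items()}


if __name__ == "__main__":
    for (retention_days, roles), names in plan_indexes(SOURCETYPES).items():
        print(f"{retention_days} days, roles {sorted(roles)}: {names}")
```

With this placeholder inventory, the two security sourcetypes land in one group and the other two each need their own, so three indexes would be the minimum under these assumptions.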

---
If this reply helps you, Karma would be appreciated.


packet_hunter
Contributor

Thank you for the reply; that makes sense. Going back to the number of sources or sourcetypes per index, though: is there an optimal number, assuming the considerations you mentioned are not an issue?

For instance you would not want everything in main.

Besides data retention, user roles, and grouping related sources together (for correlation), is there a number of sources or a volume of data to consider per index?

Thank you


richgalloway
SplunkTrust

There are some who say nothing should be in the main index as it means no thought was put into how the data should be managed.

I suppose one could make a case for putting related sources into the same index so fewer buckets have to be scanned (assuming timestamps are relatively close), but I've never seen anything about optimal ratios of sources to indexes.


packet_hunter
Contributor

Thank you for the awesome advice (I will implement your suggestions).

Before I accept your answer, are you suggesting that each source should be in a separate index, even if the sources are related for correlation purposes?

In other words, if there is a 1:1 mapping of index to source (each source in its own index), would a high number of indexes cause a problem? This is more of an academic question.

Have you seen, or can you refer me to, any Splunk docs with examples for indexing sources?

Thank you


richgalloway
SplunkTrust

I am not suggesting having each source in a separate index. If that makes sense for you then go ahead.

Having a high number of indexes can cause a problem. Having many indexes and many buckets within those indexes could lead to Splunk running out of file handles.
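
As a rough way to see what the file handle ceiling looks like on a Unix-like host, the open-file limits of a process can be read with Python's standard `resource` module. This is a minimal sketch that reports the limits of the Python process itself. It assumes the script runs as the same user and under the same limits as Splunk, which may not hold on a given system, and `open_file_limits` is a hypothetical helper name.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"soft limit: {soft}, hard limit: {hard}")
```

The soft limit is the one a process actually hits first, so a low value there is the more likely constraint as the number of indexes and buckets grows.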
