There are other considerations besides sources or sourcetypes. Indexes should be set up based on data retention periods and access rules. In other words, sourcetypes that must be retained for different lengths of time must be in different indexes. Similarly, if access to some data is to be limited to certain users or roles, then that data must be in a separate index.
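As a rough sketch of that rule, the Python below groups sourcetypes into indexes by their (retention, allowed roles) pair, so anything that differs in either ends up in a different index. The sourcetype names, retention values, roles, and the idx_N naming scheme are all made up for illustration; this is a planning aid, not something Splunk does for you.

```python
from collections import defaultdict

# Hypothetical requirements:
# sourcetype -> (retention in days, roles allowed to search it)
requirements = {
    "access_combined": (90, frozenset({"web_ops"})),
    "nginx_error": (90, frozenset({"web_ops"})),
    "linux_secure": (365, frozenset({"security"})),
    "cisco_asa": (365, frozenset({"security", "network"})),
}


def sort_key(item):
    """Order groups by retention, then by role names, for stable output."""
    (days, roles), _sourcetypes = item
    return (days, sorted(roles))


def plan_indexes(reqs):
    """Group sourcetypes that share the same retention period and access rule."""
    groups = defaultdict(list)
    for sourcetype, key in sorted(reqs.items()):
        groups[key].append(sourcetype)

    plan = {}
    for n, ((days, roles), sourcetypes) in enumerate(
        sorted(groups.items(), key=sort_key), start=1
    ):
        # idx_N is a placeholder name, not a Splunk convention.
        plan[f"idx_{n}"] = {
            "retention_days": days,
            "roles": sorted(roles),
            "sourcetypes": sourcetypes,
        }
    return plan


plan = plan_indexes(requirements)
for name, spec in plan.items():
    print(name, spec)
```

With the sample data above, the two web sourcetypes share one index, while the two security sourcetypes are split because their access rules differ even though their retention is the same.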
Thank you for the reply; that makes sense. Going back to the number of sources / sourcetypes per index, is there an optimal number of sources, assuming the considerations you mentioned are not an issue?
For instance, you would not want everything in main.
Besides data retention, user roles, and grouping related sources together (for correlation), is there a number of sources or a data volume to consider per index?
There are some who say nothing should be in the main index as it means no thought was put into how the data should be managed.
I suppose one could make a case for putting related sources into the same index so fewer buckets have to be scanned (assuming timestamps are relatively close), but I've never seen anything about optimal ratios of sources to indexes.
Thank you for the awesome advice (I will implement your suggestions).
Before I accept your answer, are you suggesting that each source should be in a separate index, even if the sources are related for correlation reasons?
In other words, if you have a 1:1 mapping of index to source (each source in its own index), would a high number of indexes cause a problem? This is more of an academic question.
Have you seen, or can you refer me to, any Splunk docs with examples for indexing sources?
I am not suggesting having each source in a separate index. If that makes sense for you, then go ahead.
Having a high number of indexes can cause a problem. Having many indexes and many buckets within those indexes could lead to Splunk running out of file handles.
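If you want a feel for how close you might be to that limit, here is a minimal sketch that reads the open-file limit for the current process and subtracts a deliberately crude worst case. The headroom function and its files_per_bucket parameter are hypothetical; Splunk does not keep every file in every bucket open at once, so treat the result as a pessimistic bound and not a prediction. It uses Python's resource module, which is available on Unix-like systems only.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open-file-descriptor limits for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def headroom(soft_limit, indexes, buckets_per_index, files_per_bucket):
    """Crude worst case: assume every file in every bucket is open at once.

    All three counts are numbers you supply yourself; nothing here is
    measured from a running Splunk instance.
    """
    return soft_limit - indexes * buckets_per_index * files_per_bucket


soft, hard = open_file_limits()
print("soft limit:", soft, "hard limit:", hard)

# Example with invented numbers: 50 indexes, 20 buckets each, 10 files per bucket.
print("headroom at a soft limit of 8192:", headroom(8192, 50, 20, 10))
```

Limits apply per process, so to get a meaningful number you would run this as the same user, and under the same limits, as splunkd. A negative headroom figure only says the worst case would exceed the limit, not that Splunk will actually fail.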