Hello all.
We are about to implement a new clustered environment, and the way we handle indexes is making us wonder whether we're doing the right thing. We currently have an index per application, covering everything from OS infrastructure data to logs to special perfmon commands for web servers, etc. I recall reading in the docs something about a beefy server only handling around 8 indexes nicely.
Our question is: how many indexes can a clustered environment (master node, 3 indexers, 3 search heads, and a deployment server) handle nicely? If we stick to having a few indexes per app plus an OS index, we're looking at this kind of math:
1 OS Index + (2 Indexes * 450 Applications) = at least 901 indexes. Is that an insane amount of indexes?
(PS: we currently only have a single instance with about 15 indexes total, and we plan to expand to the numbers above over the next year or two.)
900 indexes should be fine! We've tested with more than that internally. The number of buckets is what starts causing slowdowns/issues (over 100k) - see http://answers.splunk.com/answers/233441/cluster-master-is-unable-to-meet-search-factor-and.html
I don't believe there is a limit on the number of indexes as such. If you're worried about performance (I assume that's what you meant by 'handle nicely'), it all depends on how your data comes in and how you write your searches. A point to remember is that more indexes will cause more buckets to be created. If your searches don't specify index=IndexName, Splunk has to search across all buckets to find your data, and performance suffers.
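To make the scoping point concrete, here is a quick sketch with a hypothetical index name and sourcetype (app01_logs and access_combined are just illustrative):

```spl
# Scoped search: only buckets belonging to the app01_logs index are examined
index=app01_logs sourcetype=access_combined status=500

# Unscoped search: falls back to the default search indexes, so far more
# buckets may be opened to find the same events
sourcetype=access_combined status=500
```

With ~900 indexes, the scoped form lets the indexers skip the vast majority of buckets outright.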
Thanks for the response; yes, by 'handle nicely' I meant performance. I'm just unsure whether the large number of buckets caused by ~1000 indexes would affect the index cluster at all. And yes, we typically search with index=indexName. The only index we won't search that way is the OS index, and we plan to look into tags for it, as someone recommended.
Well, in a clustered environment, the more indexes you have, the more buckets there are to replicate. What I would suggest is, based on how much data comes into each index, tune the bucket rolling settings such as maxDataSize and maxHotSpanSecs so that fewer buckets roll over to the next stage. See more details on the data bucket lifecycle here:
http://wiki.splunk.com/Deploy:BucketRotationAndRetention
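As a rough sketch of that tuning advice, an indexes.conf stanza might look like the following. The index name and values here are illustrative assumptions, not recommendations; pick them based on each index's actual daily volume:

```ini
# indexes.conf -- illustrative only; tune per index
[app01_logs]
homePath   = $SPLUNK_DB/app01_logs/db
coldPath   = $SPLUNK_DB/app01_logs/colddb
thawedPath = $SPLUNK_DB/app01_logs/thaweddb
# Let hot buckets grow larger before rolling, so fewer buckets are created
maxDataSize = auto_high_volume
# Maximum event-time span of a hot bucket, in seconds (90 days here)
maxHotSpanSecs = 7776000
```

Fewer, larger buckets mean less replication overhead for the cluster master to track, which matters more as the index count grows.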