Hi All,
Our company has recently hopped on the Splunk bandwagon, and we've set up a small distributed environment: 1 x search head, 2 x indexers and an intermediate forwarder for data filtering and future scaling, with indexes per business function, etc.
My question is: where do we scale from here?
I see many different options and I'd like some input on how various sites have scaled beyond an initial set of indexers. (I'm looking to plan this correctly from the word go.) We already have requirements for more data, more speed, etc.
The simplest approach, and the best utilization of VM resources, seems to be scaling indexers horizontally in a single data pool, i.e. increasing my 2 indexers to 4 and continuing to stripe all indexes across those 4 indexers. Minus some overhead, this should effectively double (or say 1.8x) my search performance, given the way Splunk assigns CPU threads per search (right?).
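To put rough numbers on that guess, here is a minimal sketch in Python. It is only a back-of-the-envelope model, not how Splunk actually schedules searches: it assumes search time follows Amdahl's law, with a hypothetical `parallel_fraction` (the share of search work that spreads evenly across indexers) of 0.94, a value picked only because it reproduces the "say 1.8x" figure for going from 2 to 4 indexers. The function names are made up for illustration.

```python
def amdahl_speedup(n_indexers: int, parallel_fraction: float) -> float:
    """Estimated speedup over a single indexer under Amdahl's law.

    parallel_fraction is an ASSUMED share of search work that is spread
    evenly across indexers; the remainder (search head merge, overhead)
    is treated as serial. This is not a measured Splunk figure.
    """
    if n_indexers < 1:
        raise ValueError("n_indexers must be >= 1")
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_indexers)


def relative_gain(old_n: int, new_n: int, parallel_fraction: float) -> float:
    """Speedup from growing the pool from old_n to new_n indexers."""
    return (amdahl_speedup(new_n, parallel_fraction)
            / amdahl_speedup(old_n, parallel_fraction))


if __name__ == "__main__":
    p = 0.94  # hypothetical value, chosen to match the ~1.8x guess for 2 -> 4
    previous = 2
    for n in (4, 6, 8, 10):
        gain = relative_gain(previous, n, p)
        print(f"{previous} -> {n} indexers: {gain:.2f}x faster than before")
        previous = n
```

Under this assumption each additional pair of indexers buys less than the previous pair, which is the diminishing return I'm wondering about. The real curve depends on the searches being run and would need to be measured.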
So 4 indexers seems reasonable, and I can see that 6 or 8 may also work well in this configuration.
Is this the way to go? With all my indexed data striped across, say, up to 10 indexers, surely performance must be great for larger searches? Beyond that, the performance benefit of adding another indexer to the pool seems negligible. Of course, this also significantly increases the number of points of failure ;(
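As a rough way to quantify the failure concern, here is a minimal Python sketch. It assumes indexers fail independently and that each one is unavailable some hypothetical fraction of the time (`per_indexer_unavailability`, set to 1% below purely for illustration). It also assumes no replication, so any single indexer being down means a striped index is only partially searchable.

```python
def any_indexer_down(n_indexers: int, per_indexer_unavailability: float) -> float:
    """Probability that at least one of n indexers is down at a given moment.

    ASSUMPTIONS: failures are independent, and per_indexer_unavailability
    is a made-up illustrative figure, not a Splunk or vendor statistic.
    """
    if n_indexers < 1:
        raise ValueError("n_indexers must be >= 1")
    if not 0.0 <= per_indexer_unavailability <= 1.0:
        raise ValueError("per_indexer_unavailability must be between 0 and 1")
    all_up = (1.0 - per_indexer_unavailability) ** n_indexers
    return 1.0 - all_up


if __name__ == "__main__":
    q = 0.01  # hypothetical: each indexer is unavailable 1% of the time
    for n in (2, 4, 8, 10):
        print(f"{n} indexers: {any_indexer_down(n, q):.1%} chance at least one is down")
```

The chance of running with incomplete data grows roughly in proportion to the number of indexers while the per-indexer figure stays small, so the wider the stripe, the more often some slice of every index is missing.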
Does anyone out there have a single index spread across 8 or more indexers? How does it perform?
Or is a better method to group indexes across subsets of indexers, perhaps by business function? For example: HR indexes --> Indexers A,B,C,D,E ; Web Servers --> F,G,H,I,J (etc.)
Any input would be appreciated. In fact, are there any topologies of large sites available from Splunk or another source?
Thanks,
I think the best thing is to contact Splunk and maybe bring in a professional services guy (or gal!) to help you out. There is much to be said on this subject, including some gotchas. While it would certainly be possible to cover as many situations as possible in an answer here, I really think that for a deployment of your size the absolute best option is to get thoughts on your specific situation straight from people who deal with this stuff all the time.
Yes, that's an option too. There's good info about requirements for search heads and indexers (e.g. the quantity needed to support a given data volume), but I'm still looking for a good reference to validate our topology. Thanks.