Solved: indexer sizing and virtualization

Steve_Litras · ‎08-31-2011

Hi -

I'm embarking on a reorganization of my Splunk environment. I've come into possession of a couple of big x86 boxes (4 socket, 8 core, 256GB RAM). Given that I've heard multiple times that indexing is better distributed horizontally across smaller boxes, I wonder: if I replace two of my three indexers with these boxes, can Splunk take advantage of the hardware? Or am I better off partitioning these boxes via virtualization into a number of smaller boxes (but still using local disk resources, etc.)?

Thanks
Steve

lguinn2 · ‎09-03-2011

Splunk can definitely use all the resources of the box. It is high-performance, multi-threaded code. I would not virtualize the box; this will not help Splunk use it better. The overhead of virtualization will not be offset by any performance improvements. (I am a VMware-certified VCP4, if that helps my credibility on this one.)

Think of the Splunk advice as "Spend your money on lots of average-sized servers, rather than a few giant servers." Why?

- Average-sized "commodity" servers are relatively cheap. When you start buying "extra large" memory, etc., it tends to come at a premium price. So "commodity" servers are the most effective way to spend the money. (But your servers were free!)
- More boxes give you more concurrent IO. For Splunk indexers, IO is usually the bottleneck.
- Splunk is designed to be distributed; adding more indexers increases search performance nearly linearly.

Lucky you - I wish someone would give me some physical servers (that weren't ready for the junk heap)! I would probably make them all indexers, unless my search head was overloaded. Adding indexers = reduced search time = more searches per minute = can serve more users effectively.
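To put rough numbers on the "adding indexers = reduced search time" chain, here is a back-of-the-envelope sketch in Python. It is only a toy model, not anything measured on a real deployment: the function name `estimated_search_seconds` and the figures (120 seconds of distributable work, 5 seconds of fixed search-head work) are made-up assumptions. It simply assumes the indexer-side work divides evenly across indexers, which is what "nearly linearly" would look like.

```python
def estimated_search_seconds(indexers, distributed_seconds=120.0, search_head_seconds=5.0):
    """Toy estimate of the wall-clock time for one search.

    Assumption: the indexer-side work splits evenly across all indexers,
    while the search-head work stays constant. Both defaults are
    hypothetical numbers chosen only for illustration.
    """
    if indexers < 1:
        raise ValueError("need at least one indexer")
    return search_head_seconds + distributed_seconds / indexers


if __name__ == "__main__":
    for n in (1, 2, 3, 6):
        seconds = estimated_search_seconds(n)
        # Searches per minute if searches ran back to back, one at a time.
        print(f"{n} indexers: ~{seconds:.0f} s per search, "
              f"~{60 / seconds:.2f} searches per minute")
```

The fixed search-head term is why the scaling is "nearly" rather than perfectly linear in this model; in practice disk IO on each indexer will matter far more than these placeholder numbers.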

RicoSuave · ‎09-01-2011

You might want to check out this link: http://docs.splunk.com/Documentation/Splunk/4.2.3/Installation/CapacityplanningforalargerSplunkdeplo...

The real question is: how much data are you indexing, and how many users do you have? I would probably just set up the "big" boxes as dedicated indexers and the lesser box as a dedicated search head. The biggest consideration is disk I/O performance. If you have a lot of users, it might make more sense to make one of the big boxes a dedicated search head instead, to accommodate more concurrent searches.
