Getting Data In

indexer sizing and virtualization

Steve_Litras
Path Finder

Hi -

I'm embarking on a re-organization in my splunk environment. I've come into possession of a couple big x86 boxes (4 socket, 8 core, 256GB RAM), and given that I heard multiple times that indexing is better distributed horizontally across smaller boxes, it leads me to wonder this: If I replace two of my 3 indexers with these boxes, can splunk take advantageof the hardware? Or am I better off partitioning these boxes via virtualization into a number of smaller boxes (but still using local disk resources, etc.)?

Thanks
Steve

Tags (1)
0 Karma
1 Solution

lguinn2
Legend

Splunk can definitely use all the resources of the box. It is high-performance multi-threaded code, I would not virtualize the box; this will not help Splunk use it better. The overhead of virtualization will not be offset by any performance improvements. (I am a VMware-certified VCP4, if that helps my credibility on this one.)

Think of the Splunk advice as "Spend your money on lots of average-sized servers, rather than a few giant servers." Why?

  • average-sized "commodity" servers are relatively cheap. When you start buying "extra large" memory, etc., it tends to come at a premium price. So "commodity" servers are the most effective way to spend the money. (But your servers were free!)
  • more boxes gives you more concurrent IO - for Splunk indexers, IO is usually the bottleneck
  • Splunk is designed to be distributed; adding more indexers increases search performance nearly linearly

Lucky you - I wish someone would give me some physical servers (that weren't ready for the junk heap)! I would probably make them all indexers, unless my search head was overloaded. Adding indexers = reduced search time = more searches per minute = can serve more users effectively.

View solution in original post

0 Karma

lguinn2
Legend

Splunk can definitely use all the resources of the box. It is high-performance multi-threaded code, I would not virtualize the box; this will not help Splunk use it better. The overhead of virtualization will not be offset by any performance improvements. (I am a VMware-certified VCP4, if that helps my credibility on this one.)

Think of the Splunk advice as "Spend your money on lots of average-sized servers, rather than a few giant servers." Why?

  • average-sized "commodity" servers are relatively cheap. When you start buying "extra large" memory, etc., it tends to come at a premium price. So "commodity" servers are the most effective way to spend the money. (But your servers were free!)
  • more boxes gives you more concurrent IO - for Splunk indexers, IO is usually the bottleneck
  • Splunk is designed to be distributed; adding more indexers increases search performance nearly linearly

Lucky you - I wish someone would give me some physical servers (that weren't ready for the junk heap)! I would probably make them all indexers, unless my search head was overloaded. Adding indexers = reduced search time = more searches per minute = can serve more users effectively.

View solution in original post

0 Karma

RicoSuave
Builder

You might want to check out this link http://docs.splunk.com/Documentation/Splunk/4.2.3/Installation/CapacityplanningforalargerSplunkdeplo...

The real question is, how much data are you indexing and how many users do you have? I would probably just set up the "big" boxes up as dedicated indexers and the lesser box as a dedicated search head. The biggest consideration is Disk I/O performance. If you have a lot of users, it might make more sense to make one of the big boxes a dedicated search head instead to accommodate more concurrent searches.

0 Karma
.conf21 CFS Extended through 5/20!

Don't miss your chance
to share your Splunk
wisdom in-person or
virtually at .conf21!

Call for Speakers has
been extended through
Thursday, 5/20!