
indexer sizing and virtualization

Steve_Litras
Path Finder

Hi -

I'm embarking on a re-organization of my Splunk environment. I've come into possession of a couple of big x86 boxes (4 socket, 8 core, 256GB RAM). I have heard multiple times that indexing is better distributed horizontally across smaller boxes, which leads me to wonder: if I replace two of my 3 indexers with these boxes, can Splunk take advantage of the hardware? Or am I better off partitioning these boxes via virtualization into a number of smaller boxes (but still using local disk resources, etc.)?

Thanks
Steve


lguinn2
Legend

Splunk can definitely use all the resources of the box. It is high-performance, multi-threaded code. I would not virtualize the box; this will not help Splunk use it better. The overhead of virtualization will not be offset by any performance improvements. (I am a VMware-certified VCP4, if that helps my credibility on this one.)

Think of the Splunk advice as "Spend your money on lots of average-sized servers, rather than a few giant servers." Why?

  • average-sized "commodity" servers are relatively cheap. When you start buying "extra large" memory, etc., it tends to come at a premium price. So "commodity" servers are the most effective way to spend the money. (But your servers were free!)
  • more boxes give you more concurrent IO - for Splunk indexers, IO is usually the bottleneck
  • Splunk is designed to be distributed; adding more indexers increases search performance nearly linearly

Lucky you - I wish someone would give me some physical servers (that weren't ready for the junk heap)! I would probably make them all indexers, unless my search head was overloaded. Adding indexers = reduced search time = more searches per minute = can serve more users effectively.
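The chain "adding indexers = reduced search time = more searches per minute" can be put into rough numbers. The following is only a back-of-the-envelope sketch, not a Splunk sizing formula: the function names, the `efficiency` parameter (how much of an ideal linear speedup each extra indexer delivers), and the sample figures are all assumptions made for illustration. It also assumes the data is spread evenly across indexers and that the search head is not the bottleneck.

```python
def estimated_search_seconds(single_indexer_seconds, indexers, efficiency=0.9):
    """Rough search time when the same data is spread evenly over `indexers`.

    `efficiency` is a hypothetical knob: 1.0 means perfectly linear scaling,
    lower values model the "nearly linear" case.
    """
    if indexers < 1:
        raise ValueError("need at least one indexer")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    # The first indexer counts fully; each additional one adds `efficiency`.
    speedup = 1 + (indexers - 1) * efficiency
    return single_indexer_seconds / speedup


def searches_per_minute(search_seconds):
    """Searches completed per minute if they run back to back."""
    return 60.0 / search_seconds


if __name__ == "__main__":
    # Assumed baseline: a search that takes 60 s on a single indexer.
    for n in (1, 2, 3, 5):
        t = estimated_search_seconds(60.0, n)
        print(f"{n} indexer(s): ~{t:.1f} s per search, "
              f"~{searches_per_minute(t):.1f} searches/min")
```

Real numbers depend heavily on disk IO, data volume, and the kind of search, so treat the output as a way to reason about the trend rather than as a prediction.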


RicoSuave
Builder

You might want to check out this link http://docs.splunk.com/Documentation/Splunk/4.2.3/Installation/CapacityplanningforalargerSplunkdeplo...

The real question is, how much data are you indexing and how many users do you have? I would probably just set the "big" boxes up as dedicated indexers and the lesser box as a dedicated search head. The biggest consideration is disk I/O performance. If you have a lot of users, it might make more sense to make one of the big boxes a dedicated search head instead, to accommodate more concurrent searches.
