Distributed environment scaling via commodity hardware or one big giant virtual machine?

deodion
Path Finder

We have seen the reference hardware spec for performance and scaling. How about the scenario below?

What is the difference between (let's say):
3 x servers with spec:
12 physical cores, 32GB RAM, 800 IOPS per server

versus

ONE BIG giant virtual machine with spec:
36 physical cores, 96GB RAM, 2400 IOPS

Many thanks,


sdvorak_splunk
Splunk Employee

Unless you are dedicating resources to the virtual machine, it is unlikely to perform as well as 3 smaller physical machines. In fact, you will incur a 10% indexing performance penalty simply by running virtually (worse if there is contention for CPU, RAM, or disk).
Also, a single indexing pipeline uses about 4 CPUs per machine, so 3 servers would have 12 CPUs' worth of indexing horsepower. As of 6.3 you can run up to 2 indexing pipelines (the recommended maximum), which would give you 8 CPUs' worth of indexing horsepower on the "big server". That is less than the 3 smaller servers, but it would leave more CPUs available for handling search. If that meets your needs, it's probably fine.
Personally, if I were setting this up, I would want the 3 servers, with known dedicated resources and the inherent redundancy that comes with them. But if you are going for ease of use and simplified maintenance/administration, 1 server does fit the bill.

Info on indexing parallelization:
http://docs.splunk.com/Documentation/Splunk/6.3.0/Capacity/Parallelization
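
To make the arithmetic in this answer concrete, here is a minimal Python sketch. It assumes the rule of thumb given above (about 4 cores kept busy per indexing pipeline), which is an estimate from this thread rather than a documented figure. The function name `indexing_cores` and its parameters are hypothetical, and the virtualization penalty is not modeled.

```python
# Rule of thumb from the answer above (an assumption, not a documented limit):
# each indexing pipeline keeps roughly 4 cores busy.
CORES_PER_PIPELINE = 4


def indexing_cores(servers: int, cores_per_server: int, pipelines_per_server: int) -> dict:
    """Split the total core count into cores busy indexing and cores left over."""
    total = servers * cores_per_server
    # A server cannot spend more cores on indexing than it actually has.
    per_server = min(pipelines_per_server * CORES_PER_PIPELINE, cores_per_server)
    indexing = servers * per_server
    return {"total": total, "indexing": indexing, "remaining": total - indexing}


# Option 1: three 12-core servers, one pipeline each.
three_small = indexing_cores(servers=3, cores_per_server=12, pipelines_per_server=1)
# Option 2: one 36-core VM, two pipelines (the recommended maximum as of 6.3).
one_big = indexing_cores(servers=1, cores_per_server=36, pipelines_per_server=2)

print(three_small)  # {'total': 36, 'indexing': 12, 'remaining': 24}
print(one_big)      # {'total': 36, 'indexing': 8, 'remaining': 28}
```

Under these assumptions both options have 36 cores in total, but the three smaller servers put 12 of them toward indexing while the single VM puts 8 toward indexing and leaves 28 for search and everything else.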


sdvorak_splunk
Splunk Employee

I should add: if you go with the larger system, your expansion option is to add another identical large system, whereas with 3 smaller servers you could add a single identical smaller server.
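
As a rough illustration of that expansion point, the sketch below compares the size of the smallest growth step for each layout. It uses core count as a stand-in for capacity, which is a simplifying assumption, and the function name `growth_step` is hypothetical.

```python
def growth_step(current_servers: int, cores_per_server: int) -> float:
    """Fractional capacity increase from adding one more identical server."""
    current_cores = current_servers * cores_per_server
    return cores_per_server / current_cores


big_step = growth_step(current_servers=1, cores_per_server=36)
small_step = growth_step(current_servers=3, cores_per_server=12)

print(f"one big VM:    +36 cores per step ({big_step:.0%})")    # 100%
print(f"three servers: +12 cores per step ({small_step:.0%})")  # 33%
```

With one big machine the next step doubles the footprint, while with three smaller servers capacity can grow by about a third at a time.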


deodion
Path Finder

Where can I find the reference doc or link for the statement that a single indexing pipeline uses 4 CPUs per machine?

I had never heard before that a single indexer has an upper limit on how many CPUs it can use for indexing, or perhaps also for searching.

Thanks,


sdvorak_splunk
Splunk Employee

I'm not sure there is a reference in the docs to the 4-processor usage for indexing. However, there are four distinct queues and processes in the indexing pipeline, and each of those tends to use the better part of a CPU. That is the reason we say that 4 CPUs are used per pipeline.


deodion
Path Finder

Do you mean the 4 distinct queues we usually see in slide decks explaining the indexing process, such as typing, parsing, etc.? (If I recall correctly. I'm not sure and will look back.)


hardikJsheth
Motivator

The first option will be better, as we can create a cluster and use the power of Splunk replication.


deodion
Path Finder

Yes, that is obvious. I wasn't asking about the cluster feature.

I'm talking about performance and the overall pros and cons of each option.

Thanks, by the way.
