Distributed environment scaling via commodity hardware or one big giant virtual machine?

Path Finder

We have seen the reference hardware specs for performance and scaling, but how do they apply to the comparison below?

What is the difference between (let's say):
3 x servers with spec:
12 physical cores, 32GB RAM, 800 IOPS per server

versus

ONE BIG giant virtual machine with spec:
36 physical cores, 96GB RAM, 2400 IOPS

Many thanks,

Splunk Employee

Unless you are dedicating resources to the virtual machine, it is unlikely to perform as well as 3 smaller physical machines. In fact, you will incur a 10% performance penalty for indexing simply by running virtually (worse if there is contention for CPU, RAM, or disk).
Also, a single indexing pipeline would use 4 CPUs per machine, so 3 servers would have 12 CPUs' worth of indexing horsepower. As of 6.3 you can have up to 2 (max recommended) indexing pipelines, which would give you 8 CPUs' worth of indexing horsepower on the "big server". That is less than the 3 smaller servers, but it would leave more CPUs available for handling search. If that meets your needs, it's probably fine.
Personally, if I were setting this up, I would want the 3 servers with known dedicated resources and the inherent redundancy that comes with them. But if you are going for ease of use and simplified maintenance/administration, 1 server does fit the bill.
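
The CPU arithmetic above can be sketched in a few lines of Python. This is only an illustration under the assumptions stated in this answer: roughly 4 CPUs per indexing pipeline (a rule of thumb, not a documented constant), 1 pipeline on each small server, and 2 pipelines on the big one. The `Layout` class and its field names are hypothetical, and the sketch ignores the virtualization penalty and any I/O limits.

```python
from dataclasses import dataclass

# Rule of thumb from this answer, not a documented Splunk constant.
CPUS_PER_PIPELINE = 4


@dataclass(frozen=True)
class Layout:
    """Hypothetical description of an indexer deployment."""
    name: str
    servers: int
    cores_per_server: int
    pipelines_per_server: int

    @property
    def total_cores(self) -> int:
        return self.servers * self.cores_per_server

    @property
    def indexing_cores(self) -> int:
        # Cores consumed by indexing pipelines across all servers.
        return self.servers * self.pipelines_per_server * CPUS_PER_PIPELINE

    @property
    def remaining_cores(self) -> int:
        # Cores left over for search and everything else.
        return self.total_cores - self.indexing_cores


three_small = Layout("3 x 12-core servers", 3, 12, 1)
one_big = Layout("1 x 36-core VM", 1, 36, 2)

for layout in (three_small, one_big):
    print(
        f"{layout.name}: {layout.indexing_cores} cores for indexing, "
        f"{layout.remaining_cores} left for search"
    )
```

Under these assumptions the three small servers devote 12 cores to indexing and the single large VM devotes 8, which matches the trade-off described above: less indexing capacity on the big server, but more cores free for search.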

Info on indexing parallelization:
http://docs.splunk.com/Documentation/Splunk/6.3.0/Capacity/Parallelization

Splunk Employee

I should add: if you go with the larger system, your expansion option is to add another identical large system, whereas with 3 smaller servers you could add a single identical smaller system.
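
To put a number on that expansion granularity, here is a minimal sketch. It assumes capacity scales linearly with the number of identical servers, which is a simplification; `capacity_step_percent` is a hypothetical helper, not part of any Splunk tooling.

```python
def capacity_step_percent(current_servers: int) -> float:
    """Percent of capacity added by one more identical server.

    Assumes capacity scales linearly with server count (a simplification).
    """
    if current_servers < 1:
        raise ValueError("need at least one existing server")
    return 100.0 / current_servers


print(f"one big server: +{capacity_step_percent(1):.0f}% per added server")
print(f"three small servers: +{capacity_step_percent(3):.0f}% per added server")
```

In other words, the smallest growth step for the single large system doubles it, while the three-server layout can grow by about a third at a time.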

Path Finder

Where can I find a reference doc/link for the statement that a single indexing pipeline uses 4 CPUs per machine?

I had never heard before that a single indexer has a ceiling on how much CPU it can use for indexing, or perhaps for searching as well.

Thanks,

Splunk Employee

I'm not sure there is a reference in the docs to the 4-processor usage for indexing. However, the indexing pipeline has four distinct queues and processes, and each of those tends to use the better part of a CPU. That is the reason we say that 4 CPUs are used per pipeline.

Path Finder

You mean these 4 distinct queues are the ones we usually see in PPT slides explaining the indexing process, like typing, parsing, etc.? (If I recall correctly. I'm not sure and will look back.)

Motivator

The first option will be better, as we can create a cluster and use the power of Splunk replication.

Path Finder

Yes, that is obvious. I wasn't asking about the cluster feature.

I'm talking about performance and the overall pluses and minuses of each option.

Thanks, by the way.
