
Distributed environment scaling via commodity hardware or one big giant virtual machine?

deodion
Path Finder

We have seen the reference hardware specs for performance and scaling, but how do the following two options compare?

What is the difference between (let's say):
3 x servers with spec:
12 physical cores, 32GB RAM, 800 IOPS per server

versus

ONE BIG giant virtual machine with spec:
36 physical cores, 96GB RAM, 2400 IOPS

Many thanks,

sdvorak_splunk
Splunk Employee

Unless you dedicate resources to the virtual machine, it is unlikely to perform as well as 3 smaller physical machines. In fact, you will incur a 10% indexing performance penalty simply by running virtualized (worse if there is contention for CPU, RAM, or disk).
Also, a single indexing pipeline uses about 4 CPUs per machine, so 3 servers would have 12 CPUs' worth of indexing horsepower. As of 6.3 you can have up to 2 (max recommended) indexing pipelines per machine, which would give you 8 CPUs' worth of indexing horsepower on the "big server". That is less than the 3 smaller servers provide, but it leaves more CPUs available for handling search. If that meets your needs, it's probably fine.
Personally, if I were setting this up, I would want the 3 servers with known dedicated resources and the inherent redundancy that comes with them. But if you are going for ease of use and simplified maintenance/administration, 1 server does fit the bill.

Info on indexing parallelization:
http://docs.splunk.com/Documentation/Splunk/6.3.0/Capacity/Parallelization
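
To make the CPU arithmetic in this answer concrete, here is a rough Python sketch. It assumes the "4 CPUs per pipeline" rule of thumb stated above (not a documented constant) and ignores the virtualization penalty; the `Layout` class and its field names are made up purely for illustration.

```python
from dataclasses import dataclass

# Rule of thumb from the answer above, not an official Splunk constant.
CPUS_PER_PIPELINE = 4


@dataclass
class Layout:
    """Hypothetical description of an indexer layout, for illustration only."""
    name: str
    servers: int
    cores_per_server: int
    pipelines_per_server: int

    def indexing_cpus(self) -> int:
        # CPUs kept busy by indexing pipelines across all servers.
        return self.servers * self.pipelines_per_server * CPUS_PER_PIPELINE

    def remaining_cpus(self) -> int:
        # Cores not consumed by indexing, e.g. available for search.
        return self.servers * self.cores_per_server - self.indexing_cpus()


three_small = Layout("3 x 12-core servers", servers=3,
                     cores_per_server=12, pipelines_per_server=1)
one_big = Layout("1 x 36-core VM", servers=1,
                 cores_per_server=36, pipelines_per_server=2)

for layout in (three_small, one_big):
    print(f"{layout.name}: {layout.indexing_cpus()} CPUs for indexing, "
          f"{layout.remaining_cpus()} left over")
```

Under these assumptions the three small servers get 12 CPUs of indexing capacity with 24 left over, while the single large VM gets 8 with 28 left over, which matches the trade-off described above.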


sdvorak_splunk
Splunk Employee

I should add: if you go with the larger system, your only expansion option is to add another identical large system, whereas with 3 smaller servers you could add a single identical smaller system.
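
The expansion point can be sketched in the same way. This is only an illustration using the core counts from the question; `expansion_step` is a hypothetical helper, and it assumes capacity scales linearly with the number of identical servers.

```python
def expansion_step(current_servers: int, cores_per_server: int) -> tuple[int, float]:
    """Return (cores added, fractional growth) when adding one identical server.

    Assumes capacity scales linearly with server count (a simplification).
    """
    added_cores = cores_per_server
    growth = 1 / current_servers
    return added_cores, growth


big = expansion_step(current_servers=1, cores_per_server=36)
small = expansion_step(current_servers=3, cores_per_server=12)

print(f"One big VM:    +{big[0]} cores per step ({big[1]:.0%} growth)")
print(f"Three servers: +{small[0]} cores per step ({small[1]:.0%} growth)")
```

In other words, the smallest upgrade for the single large system doubles the hardware, while the three-server layout can grow by about a third at a time.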


deodion
Path Finder

Where can I find a reference doc or link for the statement that a single indexing pipeline uses 4 CPUs per machine?

I had never heard that a single indexer has an upper limit on how many CPUs it can use for indexing, or perhaps for searching as well.

Thanks,


sdvorak_splunk
Splunk Employee

I'm not sure there is a reference in the docs to the 4-CPU usage for indexing. However, there are four distinct queues and processing stages in the indexing pipeline, and each of those tends to use the better part of a CPU. That is the reason we say that 4 CPUs are used per pipeline.


deodion
Path Finder

Do you mean these 4 distinct queues are the ones we usually see in PowerPoint slides explaining the indexing process, like typing, parsing, etc.? (That is from memory, so I'm not sure. I will look back.)


hardikJsheth
Motivator

The first option will be better, as we can create a cluster and use the power of Splunk replication.


deodion
Path Finder

Yes, that is obvious. I wasn't asking about the cluster feature.

I'm talking about performance and the overall pros and cons of each option.

Thanks btw
