Deployment Architecture

Any more info on the i/o requirements in the deployment considerations?

Mick
Splunk Employee
Splunk Employee

The planning docs here - http://www.splunk.com/base/Documentation/latest/Installation/CapacityplanningforalargerSplunkdeploym... - recommend the following storage hardware :

4x300GB SAS hard disks at 10,000 rpm each in RAID 10 * capable of 800 IO operations / second (IOPS)

Can anyone clarify this a bit further please? We need more I/O information: The 800IOPS sustained - is that read or write? Random or sequential? Large or small block?

1 Solution

jrodman
Splunk Employee
Splunk Employee

When measuring maximum IO/s per second, the bounding factor is typically seeks, which gives the same value for both reads and writes, and is only meaningful for the random case. Most testing tools will use a mix of reads and writes to attempt to exercise the device's real capabilities. Bonnie++ will do a majority of a reads and a minority of writes, which is pretty similar to Splunk's overall load in a well-used deployment, or in the case where a few searches are actually running, which is what you want to optimize for anyway.

As for large or small blocks, I'm not sure what will produce numbers closer to Splunk's behavior. Usually block size will affect throughput much more significantly, and IO/s not so much.


As for bonnie++, I typically would get the memory size of the box (not the vm) and then run

bonnie++ -d /somewhere/on/your/intended/fs -r 16384 -b

this is for a 4GB box. Essentially I'm telling bonnie++ to do 4x work than it would autotune for, to confidentaly defeat various forms of caching.

View solution in original post

ChrisG
Splunk Employee
Splunk Employee

Note that as of 5.0.2, the topic referred to in this question has been substantially revised and expanded.

0 Karma

greich
Communicator

field 18, Random seeks.
alternatively, you can pretty print the last line of results using script including in bonnie++ e.g.:
echo "1.96,1.96,hostname,1,1365178857,192G,,,,blah,,,123456789,,," | bon_csv2html>bonnie_results.html
And look at the "Random Seeks" /sec

0 Karma

glitchcowboy
Path Finder

I'm also trying to plan for a splunk deployment and am struggling with a couple things regarding disk IO. My test box has 48GB Memory, so a 2x RAM test is 96GB, which takes a while. With bonnie++ which field should I reference for the IOPS to compare to the recommended 800 IOPs?

stefanlasiewski
Contributor

IOPS is mostly a factor of disk/RAID performance and typically won't use all that RAM. I also have a server with many CPUs and a large amount of RAM, and my Splunk server is very slow because due to disk I/O. I didn't have much guidance when I first set up my server, but now Splunk has some excellent docs and pointers at http://docs.splunk.com/Documentation/Splunk/5.0.2/Installation/Referencehardware

dragmore
Explorer

It would be awesome if someone from splunk made a few bonnie parameter examples that would simulate a few real-world examples of usage from splunk!!

jrodman
Splunk Employee
Splunk Employee

No bonnie will really simulate the splunk load. However, I can try to figure out (remember) the more useful aspects of its invocation.

0 Karma

jrodman
Splunk Employee
Splunk Employee

When measuring maximum IO/s per second, the bounding factor is typically seeks, which gives the same value for both reads and writes, and is only meaningful for the random case. Most testing tools will use a mix of reads and writes to attempt to exercise the device's real capabilities. Bonnie++ will do a majority of a reads and a minority of writes, which is pretty similar to Splunk's overall load in a well-used deployment, or in the case where a few searches are actually running, which is what you want to optimize for anyway.

As for large or small blocks, I'm not sure what will produce numbers closer to Splunk's behavior. Usually block size will affect throughput much more significantly, and IO/s not so much.


As for bonnie++, I typically would get the memory size of the box (not the vm) and then run

bonnie++ -d /somewhere/on/your/intended/fs -r 16384 -b

this is for a 4GB box. Essentially I'm telling bonnie++ to do 4x work than it would autotune for, to confidentaly defeat various forms of caching.

Get Updates on the Splunk Community!

Now Available: Cisco Talos Threat Intelligence Integrations for Splunk Security Cloud ...

At .conf24, we shared that we were in the process of integrating Cisco Talos threat intelligence into Splunk ...

Preparing your Splunk Environment for OpenSSL3

The Splunk platform will transition to OpenSSL version 3 in a future release. Actions are required to prepare ...

Easily Improve Agent Saturation with the Splunk Add-on for OpenTelemetry Collector

Agent Saturation What and Whys In application performance monitoring, saturation is defined as the total load ...