When planning my Splunk deployment, I've been told that the storage volume is probably the most important aspect. Why is this and what is the recommended hardware?
For storage performance from a physical hardware perspective, I would offer a few recommendations, most of them pretty straightforward.
We recommend storage that provides a very high number of random input/output operations per second (IOPS). Storage bandwidth is less of a consideration as almost any hardware is capable of providing the required throughput.
Striped disks provide high IOPS because requests can be spread across a greater number of disk spindles.
We recommend RAID 10 storage for the Splunk hot/warm index volumes: striping over many disks provides high IOPS, while mirroring reduces the risk of data loss due to single-drive failures. Mirroring may also provide higher read IOPS.
RAID 5 storage is perfectly acceptable for cold volumes. RAID 5 (and RAID 6) often has greatly reduced write IOPS and is therefore unsuitable for the hot volume, but this is less of a problem for cold volumes, which are written only rarely and usually in large blocks.
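To make the RAID 10 versus RAID 5 trade-off concrete, here is a rough back-of-the-envelope sketch. It uses the commonly cited write penalties for small random writes (2 back-end operations per write for RAID 10, 4 for RAID 5, 6 for RAID 6) and ignores controller caches and full-stripe writes. The disk count, per-disk IOPS, and write fraction are made-up illustrative numbers, not Splunk recommendations, and the function name is hypothetical.

```python
# Commonly cited back-end disk operations needed per logical small random write.
WRITE_PENALTY = {"raid0": 1, "raid10": 2, "raid5": 4, "raid6": 6}


def effective_iops(disks, iops_per_disk, write_fraction, level):
    """Estimate the logical IOPS an array can sustain for a given read/write mix.

    Each logical read costs one back-end operation and each logical write
    costs WRITE_PENALTY[level] back-end operations, so the raw spindle IOPS
    are divided by the average back-end cost per logical operation.
    """
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    raw = disks * iops_per_disk
    cost_per_op = (1.0 - write_fraction) + write_fraction * WRITE_PENALTY[level]
    return raw / cost_per_op


if __name__ == "__main__":
    # Illustrative assumption: 8 spindles at 150 IOPS each, half of the load is writes.
    for level in ("raid10", "raid5", "raid6"):
        print(level, round(effective_iops(8, 150, 0.5, level)))
```

With a write-heavy mix like a hot volume, the parity levels lose a large share of their spindle IOPS. With a write fraction near zero, as on a cold volume, the three levels come out close to each other, which is the point made above.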
Improving IOPS with disk caches may be possible, but small caches will likely be completely ineffective because of the large volumes of data typically involved in Splunk systems (at least on systems large enough for such considerations to matter).
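A quick way to see why a small cache does little: under the simplifying assumption that reads are spread uniformly at random over the data being searched, the cache hit fraction is roughly the cache size divided by the data size. The sketch below uses that model. The sizes and IOPS figure are invented for illustration, the function names are hypothetical, and real access patterns (for example, searches concentrated on recent data) can do better than this.

```python
def cache_hit_fraction(cache_gb, working_set_gb):
    """Hit fraction for uniformly random reads over the working set (assumed model)."""
    if working_set_gb <= 0:
        raise ValueError("working_set_gb must be positive")
    return min(1.0, cache_gb / working_set_gb)


def effective_read_iops(disk_iops, hit_fraction):
    """Read IOPS seen by the application when only cache misses reach the disks."""
    if hit_fraction >= 1.0:
        return float("inf")  # every read is served from cache in this model
    return disk_iops / (1.0 - hit_fraction)


if __name__ == "__main__":
    disk_iops = 1200  # illustrative assumption for the underlying array
    for cache_gb in (4, 64, 1000):
        hit = cache_hit_fraction(cache_gb, 2000)  # assumed 2 TB of searched data
        print(cache_gb, "GB cache:", round(effective_read_iops(disk_iops, hit)), "read IOPS")
```

In this model the cache only starts to matter once it is a sizeable fraction of the data being searched.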
We also recommend using multiple Splunk indexers and distributing data and searches across them to improve search performance. It should be more cost-effective to achieve high Splunk performance with many machines with "good" storage than with a single machine with the "absolute fastest" storage possible.
To add color here: many searches are seek-dominated (needle-in-haystack), which means you want a lot of IOPS. On the storage where you are indexing, the data is actually written multiple times, so the high cost per write of RAID 5 is a poor match for low-latency searches on the same disks.
The faster your storage, the faster Splunk can index and search data. High I/O capability is a must. The deployment docs recommend 10K RPM disks with RAID 10, but a faster SAN will work too.
Steer well clear of RAID 5; it just doesn't perform well.