Storage experts: With 20 SSDs per indexer, what's the best RAID option?

twinspop

These will be running SUSE 12. Each SSD will be 1.6TB. The systems have hardware RAID cards, but I'm tempted to go with JBOD, and use Linux tools or even ZFS to manage the volumes.

  • RAID50? E.g., 4 RAID5 groups of 5 drives each, striped together
  • RAID60?
  • Multiple RAIDZ1 or -2 with ZFS?

Our storage group recommended one giant RAID5 volume, which worries me. Rebuilding a volume that size seems like a problem, and losing a second drive during the rebuild would be a real possibility. Not to mention that protection against only a single drive failure in a 20-drive array seems like a bad idea.

EDIT: I'm trying to avoid RAID10, since it gives up 50% of the raw storage.
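
For a rough comparison of the layouts listed above, here is a minimal Python sketch of usable capacity and worst-case drive-failure tolerance with 20 x 1.6 TB drives. The RAID60 grouping (2 groups of 10) is an assumption for illustration only, the helper name `layout` is hypothetical, and the figures ignore filesystem and controller overhead.

```python
DRIVES = 20
DRIVE_TB = 1.6

def layout(groups, parity_per_group):
    """Return (usable TB, drive failures survivable in the worst case)."""
    data_drives = DRIVES - groups * parity_per_group
    # Worst case, every failure lands in the same group, so the guaranteed
    # tolerance is the parity count of a single group.
    return round(data_drives * DRIVE_TB, 1), parity_per_group

options = {
    "RAID5 (1 x 20)": layout(1, 1),
    "RAID50 (4 x 5)": layout(4, 1),
    "RAID60 (2 x 10, assumed grouping)": layout(2, 2),
    # Mirroring: half the drives hold copies; a second failure in the
    # same mirrored pair loses the volume.
    "RAID10 (10 mirrored pairs)": (round(DRIVES / 2 * DRIVE_TB, 1), 1),
}

for name, (usable_tb, worst_case) in options.items():
    print(f"{name}: {usable_tb} TB usable, survives any {worst_case} drive failure(s)")
```

Under these assumptions, RAID50 gives up about 4.8 TB relative to a single RAID5 volume, while RAID10 gives up half the raw 32 TB.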

masonmorales

We use RAID5 on our indexers, each of which has 20x 1.92 TB SSDs. Rebuild time is about 4 hours in our environment, but that depends on whether you are using hardware or software RAID, CPU speed, etc. We are also in an indexer cluster, so we can afford to have an indexer down for a rebuild that takes several hours.

For the file system, performance-wise there is no difference. We use XFS.

Are you going to be clustering your indexers? If so, there's really no reason not to go with RAID 5.

If you are in a non-clustered environment, RAID50 would work fine as well.
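
As a back-of-the-envelope check on the rebuild figure above: rebuilding one failed member means rewriting roughly one drive's capacity onto the replacement, so about 4 hours for a 1.92 TB drive implies an average of roughly 133 MB/s. The sketch below assumes decimal terabytes and a constant rate, and applies that same rate to a 1.6 TB drive. The function names are hypothetical, and real rebuild times vary with the controller, rebuild priority, and indexing load.

```python
def rebuild_rate_mb_s(drive_tb, hours):
    """Average write rate (MB/s) implied by rebuilding one drive in `hours`."""
    return drive_tb * 1_000_000 / (hours * 3600)

def rebuild_hours(drive_tb, rate_mb_s):
    """Hours to rebuild one drive of `drive_tb` at a constant `rate_mb_s`."""
    return drive_tb * 1_000_000 / rate_mb_s / 3600

rate = rebuild_rate_mb_s(1.92, 4)  # figures quoted in the post above
print(f"implied rebuild rate: {rate:.0f} MB/s")
print(f"1.6 TB drive at that rate: {rebuild_hours(1.6, rate):.1f} h")
```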

twinspop

Follow-up: RAID5 was okay at first, but its relatively poor I/O performance caught up with us, and eventually I had to re-create the volumes as RAID10. SmartStore made this fairly easy. We just updated these servers and went with fewer drives in RAID0, relying on remote storage (S2) and clustering for all redundancy.

masonmorales

If you're interested in performance differences, you can check out the .conf 2016 talk I gave, "Architecting Splunk for Epic Performance at Blizzard Entertainment", at https://conf.splunk.com/sessions/2016-sessions.html

twinspop

We are clustered: currently 5 indexers (in 2 different clusters), soon to be 12 each. Thanks for your input!

masonmorales

What are your replication factor and search factor (RF/SF)?

twinspop

For this project we plan to be RF3/SF2.
