We have a regulatory requirement to keep logs for 3 years. I am currently architecting our Splunk infrastructure and trying to figure out how far I should go to protect our logs from corruption, drive/array failures, etc.
In my current design I'm planning to have 6 load-balanced indexers (4 at the main site and 2 at the alternate site), each with an external drive enclosure filled with 25 1.2TB drives. Each indexer itself will hold 8 300GB 15k rpm drives. I'm splitting each enclosure so that 10 drives are in RAID 10 and 14 are in RAID 5 (although I suppose this could be split into two RAID 5 arrays), which gives us a total of approximately 91TB at the main site and 45TB at the alternate site.
That is enough to hold data captured at 100GB per day for 3 years.
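As a rough sanity check on those figures, here is a minimal Python sketch of the usable-capacity arithmetic. It uses the usual rules of thumb (RAID 10 yields half the drives, RAID 5 yields all but one) and makes two assumptions that are not stated above: the 8 drives in each indexer are in RAID 10, and the 25th enclosure drive is a hot spare. The function names are only for illustration.

```python
# Back-of-the-envelope capacity check (decimal units, 1 TB = 1000 GB).
# Assumptions not stated in the post: the 8 drives in each indexer are
# in RAID 10, and the 25th enclosure drive is a hot spare (no capacity).


def raid10_usable_tb(drives: int, drive_tb: float) -> float:
    """RAID 10 mirrors pairs, so half of the drives hold unique data."""
    return (drives // 2) * drive_tb


def raid5_usable_tb(drives: int, drive_tb: float) -> float:
    """RAID 5 spends one drive's worth of space on parity."""
    return (drives - 1) * drive_tb


def indexer_usable_tb() -> float:
    """Usable space for one indexer: enclosure arrays plus its own drives."""
    enclosure = raid10_usable_tb(10, 1.2) + raid5_usable_tb(14, 1.2)
    internal = raid10_usable_tb(8, 0.3)  # assumed RAID 10
    return enclosure + internal


def raw_ingest_tb(gb_per_day: float, years: int) -> float:
    """Raw (uncompressed) log volume over the retention period."""
    return gb_per_day * 365 * years / 1000


if __name__ == "__main__":
    per_indexer = indexer_usable_tb()
    print(f"per indexer:     {per_indexer:.1f} TB")      # 22.8 TB
    print(f"main site (x4):  {4 * per_indexer:.1f} TB")  # 91.2 TB
    print(f"alternate (x2):  {2 * per_indexer:.1f} TB")  # 45.6 TB
    print(f"raw logs, 3 yrs: {raw_ingest_tb(100, 3):.1f} TB")  # 109.5 TB
```

Under those assumptions the numbers match the 91TB and 45TB quoted above. The 109.5TB figure is raw log volume, before whatever compression and index overhead Splunk applies on disk, so it is not directly comparable to the usable capacity.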
This design, of course, only protects us from a certain level of drive failures. It doesn't protect us from log data corruption.
From a regulatory perspective, we get slammed hard if we lose more than a reasonable number of logs due to corruption, but I'm not sure that warrants building out a separate backup network and doubling storage using a SAN in order to protect against the small chance of data corruption.
Is my current design reasonable? Let me know what you think.
A little bit off-topic, but could be interesting all the same.
If this is only for regulatory purposes, i.e. you do not really need to keep all those logs online and searchable, not even for an auditor to look at, you should probably think about aging out your indexes to frozen after some time, say one year. You can then keep the frozen buckets on backup for another 2 years before deleting them permanently.
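To make the one-year cutoff concrete, here is a small Python sketch that works out the age limit in seconds and prints an indexes.conf stanza. frozenTimePeriodInSecs and coldToFrozenDir are the indexes.conf settings for when buckets freeze and where they get archived; the index name and the archive path below are made up for illustration, so adjust them to your environment.

```python
# Sketch: compute the freeze age and render an indexes.conf stanza.
# The index name and archive path passed in below are hypothetical.

SECONDS_PER_DAY = 24 * 60 * 60


def frozen_period_secs(days_online: int) -> int:
    """Age (in seconds) after which buckets roll from cold to frozen."""
    return days_online * SECONDS_PER_DAY


def render_stanza(index_name: str, days_online: int, archive_dir: str) -> str:
    """Build the stanza text for one index."""
    return "\n".join([
        f"[{index_name}]",
        f"frozenTimePeriodInSecs = {frozen_period_secs(days_online)}",
        f"coldToFrozenDir = {archive_dir}",
    ])


if __name__ == "__main__":
    # One year online, then archive to a (hypothetical) backup mount.
    print(render_stanza("regulatory_logs", 365,
                        "/mnt/archive/regulatory_logs/frozen"))
```

One year comes out to 31536000 seconds. Note that if no archive destination is configured (coldToFrozenDir or a coldToFrozenScript), Splunk simply deletes buckets when they freeze, and as far as I know it does not manage the archived copies afterwards, so removing them after the extra 2 years would be a separate job.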
The main point is that frozen buckets take up only around 10-15% (on average) of the original log size, whereas warm/cold buckets average around 50% of the original size and can in some cases be larger than the original logs, because of the .tsidx files that make them searchable. Frozen buckets do not keep the .tsidx files, so the events in them are not directly searchable, but the .tsidx files can be rebuilt should the need arise.
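To put rough numbers on that, here is a minimal Python sketch comparing 3 years fully online with 1 year online plus 2 years frozen. It uses the 100GB per day from the question and the average ratios quoted above (50% for warm/cold, 10-15% for frozen); real ratios depend on the data, so treat the output as an estimate only.

```python
# Rough storage comparison using the averages quoted in this thread.
# The ratios are estimates; actual on-disk size varies with the data.
from typing import Tuple

GB_PER_DAY = 100
ONLINE_RATIO = 0.50                # warm/cold buckets vs. raw log size
FROZEN_RATIO_RANGE = (0.10, 0.15)  # frozen buckets vs. raw log size


def stored_tb(days: int, ratio: float, gb_per_day: float = GB_PER_DAY) -> float:
    """On-disk size in TB for `days` of logs at the given size ratio."""
    return gb_per_day * days * ratio / 1000


def all_online_tb(years: int = 3) -> float:
    """Everything kept searchable for the whole retention period."""
    return stored_tb(365 * years, ONLINE_RATIO)


def tiered_tb(online_years: int = 1, frozen_years: int = 2) -> Tuple[float, float]:
    """(low, high) total when older data is kept as frozen buckets."""
    online = stored_tb(365 * online_years, ONLINE_RATIO)
    low, high = (stored_tb(365 * frozen_years, r) for r in FROZEN_RATIO_RANGE)
    return online + low, online + high


if __name__ == "__main__":
    low, high = tiered_tb()
    print(f"3 years online:           {all_online_tb():.2f} TB")
    print(f"1 year online + 2 frozen: {low:.2f} to {high:.2f} TB")
```

With these inputs the all-online case is about 55TB, while the tiered case lands at roughly 26 to 29TB, i.e. about half the storage for the same retention period.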
As for the risk of tampering with or losing logs, you should probably ask your auditor what s/he thinks is acceptable. AFAIK, you cannot have data block signing if your index is spread across more than one indexer.