Hi All,
We have a regulatory requirement to keep logs for 3 years. I am currently architecting our Splunk infrastructure and trying to figure out how far I should go to protect our logs from corruption, drive/array failures, etc.
In my current design I'm planning to have 6 load-balanced indexers (4 at the main site and 2 at the alternate site), each with an external drive enclosure holding 25 1.2TB drives plus 8 internal 300GB 15k RPM drives. I'm splitting each enclosure so that 10 drives are in RAID 10 and 14 are in RAID 5 (although I suppose the RAID 5 set could instead be split into two smaller RAID 5 arrays), which gives us approximately 91TB at the main site and 45TB at the alternate site.
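For reference, here's a rough back-of-the-envelope check of the usable capacity. Note the RAID level on the internal drives and leaving one enclosure bay unassigned (10 + 14 = 24 of 25) are assumptions I'm making for this estimate, not final decisions:

```python
# Rough usable-capacity check for the layout described above (all sizes in TB).
# Assumptions: internal 300GB drives in RAID 10; one enclosure bay unused.

raid10_enclosure = 10 * 1.2 * 0.5    # RAID 10 keeps half the raw space  -> 6.0 TB
raid5_enclosure  = (14 - 1) * 1.2    # RAID 5 loses one drive's capacity -> 15.6 TB
raid10_internal  = 8 * 0.3 * 0.5     # assumed RAID 10 on internal disks -> 1.2 TB

per_indexer = raid10_enclosure + raid5_enclosure + raid10_internal  # ~22.8 TB

main_site = 4 * per_indexer   # ~91 TB
alt_site  = 2 * per_indexer   # ~46 TB

print(f"per indexer: {per_indexer:.1f} TB")
print(f"main site:   {main_site:.1f} TB")
print(f"alternate:   {alt_site:.1f} TB")
```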
That should be enough capacity to retain data ingested at 100GB per day for 3 years.
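Here's the sizing math behind that claim. I'm assuming the common rule of thumb that Splunk's on-disk footprint (compressed rawdata plus index files) works out to roughly 50% of raw ingest; the real ratio depends on the data, so treat this as an estimate only:

```python
# Sanity check on the 3-year retention figure.
# Assumption: on-disk size is ~50% of raw ingest (typical Splunk rule of thumb).

daily_ingest_gb = 100
retention_days  = 3 * 365
disk_ratio      = 0.5   # assumed fraction of raw ingest that lands on disk

raw_total_tb  = daily_ingest_gb * retention_days / 1000   # ~109.5 TB raw
disk_total_tb = raw_total_tb * disk_ratio                  # ~55 TB on disk

print(f"raw ingest over 3 years: {raw_total_tb:.1f} TB")
print(f"estimated disk usage:    {disk_total_tb:.1f} TB")
```

On those assumptions, 3 years of data fits comfortably within the ~91TB at the main site.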
This design, of course, only protects us from a certain level of drive failures. It doesn't protect us from log data corruption.
From a regulatory perspective, we get slammed hard if we lose more than a reasonable number of logs to corruption, but I'm not sure that warrants building out a separate backup network and doubling our storage with a SAN just to cover the small chance of it happening.
Is my current design reasonable? Let me know what you think.
Regards,
Alex