How should I allocate space for indexes among indexing nodes? For example, let's say I have 2 groups of servers that will write to two separate indexes, indexa and indexb. I know each will log 50 GB/day, and I want to keep data from both indexes for 30 days. This means I will need approximately (50 GB/day * 30 days * 0.5 compression ratio) 750 GB of total space for each index. If I have two indexing nodes, should I split the storage up evenly by creating indexa and indexb on both servers, and set a max size of 375 GB on each (750 GB / 2)? Are there any caveats as we add more indexes or more search nodes? Do we need to over-allocate space to account for potential unevenness in the load balancing (i.e., server1 receives more traffic than server2)?
Yes, you should split it evenly between both, assuming you're using Splunk forwarders to load-balance between the two. You should allocate extra storage to account for unevenness, but how much is hard to say. For example, a server going down, loading of historical or archived data, or a large file written all at once (vs. a log file appended over time) will all tend to cause unevenness. As the number of servers goes up, unevenness in volume across indexers tends to go down.
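The sizing above could be expressed in indexes.conf on each of the two indexers; this is a sketch, and the paths and exact values (375 GB ≈ 384000 MB per indexer, 30 days ≈ 2592000 seconds) are illustrative assumptions you'd tune for your environment:

```ini
# indexes.conf (illustrative sketch, one copy per indexer)
# maxTotalDataSizeMB is enforced per indexer, so two indexers
# at ~375 GB each hold ~750 GB per index in total.
[indexa]
homePath   = $SPLUNK_DB/indexa/db
coldPath   = $SPLUNK_DB/indexa/colddb
thawedPath = $SPLUNK_DB/indexa/thaweddb
maxTotalDataSizeMB = 384000
frozenTimePeriodInSecs = 2592000   # 30-day retention

[indexb]
homePath   = $SPLUNK_DB/indexb/db
coldPath   = $SPLUNK_DB/indexb/colddb
thawedPath = $SPLUNK_DB/indexb/thaweddb
maxTotalDataSizeMB = 384000
frozenTimePeriodInSecs = 2592000   # 30-day retention
```

Note that whichever of maxTotalDataSizeMB or frozenTimePeriodInSecs is hit first triggers data roll-off, so if you over-allocate size headroom for unevenness, retention is effectively governed by the time limit.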