Hi. We are looking to estimate retention needs for a new Splunk Cloud deployment. I'm looking at using the following formula:
total space required = (retention time in days) × (ingestion per day) × (0.15 × RF + 0.35 × SF)
where RF/SF are the index clustering replication/search factors, and 0.15/0.35 are the commonly adopted compression ratios for rawdata/tsidx data. I'm aware that to calculate the extra storage to be purchased, I can subtract the 90 days retention that comes with the license.
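For illustration, here's that formula as a quick Python sketch. The 0.15/0.35 compression ratios are the commonly quoted rules of thumb from the question, and RF=3/SF=2 are just the on-prem defaults plugged in as placeholders, not confirmed Splunk Cloud values:

```python
def total_space_gb(retention_days, ingest_gb_per_day, rf=3, sf=2,
                   raw_ratio=0.15, tsidx_ratio=0.35):
    """Estimate indexed-storage footprint in GB.

    rawdata compresses to ~raw_ratio of the ingested size and is stored
    RF times; tsidx files come to ~tsidx_ratio and are stored SF times.
    """
    return retention_days * ingest_gb_per_day * (raw_ratio * rf + tsidx_ratio * sf)

# Example: 100 GB/day ingest, 90 days retention, RF=3 / SF=2
print(total_space_gb(90, 100))  # ≈ 10350 GB
```

Note how sensitive the result is to RF/SF: with RF=2/SF=2 the same workload needs ≈ 9000 GB, which is why the actual cloud-side values matter for your estimate.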
So really I need to know: is there a default set of values for RF and SF in Splunk Cloud deployments? Is it just the default on-prem indexer clustering values of RF=3, SF=2? The values could potentially make a big difference to retention needs but I don't see them mentioned in the Cloud docs.
Before I start, I'll say that I don't work for Splunk, and I highly recommend consulting your Splunk SE, since it sounds like you are interested in purchasing the Splunk Cloud offering.
I believe that above a certain GB/day license size, Splunk Cloud indexers are clustered. If you need to purchase less than that amount and have a clustering requirement, contact your Splunk SE.
Now, consider that often some kinds of data (sourcetypes) will require longer retention, while others will not.
Splunk Cloud gives you 90 days × license as total storage. Let's say 100 GB/day × 90 days ≈ 9 TB.
Consider, for example, that you have audit data at 50 GB/day that you need long retention on, but also 50 GB/day of operational data for which 30 days or less is enough. In that case, the operational data uses only 30 days × 50 GB ≈ 1.5 TB, leaving 7.5 TB for the remaining 50 GB/day of audit data, which gives you ~150 days of retention on it.
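The split above can be sketched in a few lines (same numbers as the example; this simple "storage = GB/day × days" budgeting deliberately ignores compression and replication, treating the 90-day entitlement as a raw pool):

```python
# Total entitlement: 90 days of a 100 GB/day license
total_tb = 90 * 100 / 1000                   # 9.0 TB

# Operational data only needs 30 days at 50 GB/day
operational_tb = 30 * 50 / 1000              # 1.5 TB

# Whatever is left can go to the long-retention audit index
audit_budget_tb = total_tb - operational_tb  # 7.5 TB
audit_retention_days = audit_budget_tb * 1000 / 50
print(audit_retention_days)  # 150.0
```

In Splunk terms, you would express this split by putting the sourcetypes in separate indexes with different retention settings, rather than letting everything age out at a single global retention.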
What I am trying to say is that a more detailed inspection of your data and retention needs, combined with the proper configurations, can sometimes save you big $$$$.