Splunk DB dimensioning



I’m doing storage dimensioning for our indexer cluster based on the following:
- the number of log events ingested per day,
- the average size of each log event, and
- how much the disk space used by $SPLUNK_DB increases in 1 day

Previously, to obtain the delta in disk space, I simply took 2 snapshots 24 hrs apart. But now that our data has reached retention age, with the oldest data getting deleted every day, I can no longer do that.

I’ve tried the Fire Brigade TA, but it didn’t give me what I need. So, I’m down to 2 options:
- asking our customer to temporarily increase the retention time by a few days so that the logs don’t get truncated, or
- manually searching for all buckets that hold data within the 1-day time range and finding their size
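
For the second option, one possible sketch uses the dbinspect command, which lists buckets with their time range and on-disk size. This is only a rough approximation under several assumptions. It counts the whole size of every bucket that overlaps yesterday, so a bucket spanning several days is overcounted. Hot buckets may still be growing. Data model acceleration summaries are not included. In a cluster, whether replicated copies are counted depends on which peers answer the search. The field names Buckets, Disk_MB and Raw_MB are just illustrative labels.

```
| dbinspect index=*
| eval day_start = relative_time(now(), "-1d@d"), day_end = relative_time(now(), "@d")
| where endEpoch >= day_start AND startEpoch < day_end
| stats count as Buckets, sum(sizeOnDiskMB) as Disk_MB, sum(rawSize) as Raw_Bytes by index
| eval Raw_MB = round(Raw_Bytes/1024/1024, 2), Disk_MB = round(Disk_MB, 2)
| fields - Raw_Bytes
```

Comparing Disk_MB with Raw_MB per index would give an approximate on-disk to raw ratio for that day, which could then be checked against an actual snapshot delta.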

Has anyone gone through this exercise and found a simpler way to obtain this estimate?


Re: Splunk DB dimensioning

You can actually reference the license usage logs for this:

index=_internal source=*license_usage.log type=Usage earliest=-1d@d latest=-0d@d
| stats sum(b) by st, idx
| rename sum(b) as Bytes
| eval Volume=round(Bytes/1024/1024, 2)
| eventstats sum(Volume) as Total_Volume
| fields - Bytes
| fieldformat Volume = tostring(Volume, "commas") +"MB"
| fieldformat Total_Volume = tostring(Total_Volume, "commas") +"MB"
| rename st as Sourcetype, idx as Index


Re: Splunk DB dimensioning


Actually, I'm not looking for the ingested log volume per day, but the disk space consumption on the indexer cluster, meaning the increase in these folders:
In our deployment, we have replication-factor=2, search-factor=2 and we use data model acceleration, so the actual disk space usage is quite different from the ingested log volume. In my experience, upgrading the Splunk version has sometimes substantially changed the ratio between ingested log volume and log storage, hence the need to revise the dimensioning tool from time to time...

