I'm looking for a report that shows the current size of my Splunk indexer and how much of that data is over 12 months old. Keeping in mind that Splunk compresses this data, I just need to know how much of the data I currently hold on my indexer (that is, on disk) is 12+ months old.
Can anyone assist me in figuring this out?
That is an interesting question. It's fairly easy to give you an accurate answer that will not be at all helpful. For instance, you could use tstats to find out how much data is currently indexed that is at least that old, then multiply by a SWAG of about 50% to account for compression (10%) and indexing (40%).
However, we have to assume that your business case is that you plan to DO something with the answer, and in that case, giving you a useful answer is more problematic.
Data in Splunk is stored in chunks called buckets. A bucket rolls off only when the YOUNGEST data in that bucket is older than the date in question.
So, one answer we might be able to tell you how to get is, "how much space is used by files that have data in them that is no younger than 12 months?"
You could probably just poll your file system in your cold bucket folder for that last update date, which should give you the latest time that data was added to that chunk of data or index. (But I'm checking with wiser heads on whether that is true.)
Per @martin_mueller, start here
| dbinspect index=* | eval Month=strftime(endEpoch,"%Y-%m") | stats sum(sizeOnDiskMB) by index Month
Each record has a startEpoch and an endEpoch; I've dated each bucket by the month of the last data it contains.
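To get at the original question more directly, the same search can be split at a 12-month cutoff instead of by month. This is only a sketch: AgeGroup and TotalMB are names made up for illustration, and a bucket counts as "12+ months" only when its newest event (endEpoch) is older than the cutoff, so a bucket that straddles the cutoff lands in the younger group.

```
| dbinspect index=*
| eval AgeGroup=if(endEpoch < relative_time(now(), "-12mon"), "12+ months", "under 12 months")
| stats sum(sizeOnDiskMB) AS TotalMB by index AgeGroup
```

Because sizeOnDiskMB is the size of the bucket on disk, the totals already reflect compression and index overhead, so no SWAG multiplier should be needed here.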
If you are looking at individual server disk space, then you will want to include splunk_server in the
You can also run it with index=_* for the internal indexes.
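One possible per-server version, assuming the intent is to add splunk_server (a field dbinspect returns for each bucket) to the by clause. SizeMB is an illustrative name, and the where clause keeps only buckets whose newest data is over 12 months old.

```
| dbinspect index=*
| where endEpoch < relative_time(now(), "-12mon")
| stats sum(sizeOnDiskMB) AS SizeMB by splunk_server index
```

Swap index=* for index=_* to run the same breakdown against the internal indexes.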