
Determine how much data is being stored daily vs how much is aging out?

mmcarty
New Member

Hello community,

First of all, thank you very much for taking the time to read my question.
I am a Splunk newbie, and we have been instructed to "determine how much data is being stored daily vs. how much is aging out."

  • We have a clustered indexer environment.
  • We're approaching our disk capacity.

We age out data once it reaches its retention period, so while we index new data we also discard old data. However, because we have added more data sources over the last year, we are most likely ingesting more than we are aging out.

My question is:
How can we determine how much data is being stored daily vs. how much is aging out?
This should give us an idea of how much runway we have.

Thank you very much!


adonio
SplunkTrust

Hello there,

To check how much data is being committed to disk daily, set the time picker to the day you are interested in and run the following search:

| dbinspect index=*
| stats sum(sizeOnDiskMB) as daily_mb_to_disk 

If you don't roll to frozen, it can be a little tricky to track the total size of the buckets that are rolling out.
You can try measuring the disk space used at the beginning and at the end of a day, and combine those figures with the daily total from the search above to see how much data rolled out.
Example: total disk size at 4/4 00:00:00 is 500GB, and total data used at that time is 400GB.
Total data committed on 4/4 is 50GB.
If the total data used at 4/5 00:00:00 is 425GB, then 400 + 50 - 425 = 25GB rolled out.
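
The arithmetic in the example can be sketched in a few lines of Python. This is only an illustration using the numbers above. The function names are made up for this post, and the runway estimate assumes the net daily growth stays constant, which is a simplification.

```python
def rolled_out_gb(used_start_gb, committed_gb, used_end_gb):
    """Data aged out = used at start + committed during the day - used at end."""
    return used_start_gb + committed_gb - used_end_gb


def runway_days(capacity_gb, used_gb, net_growth_gb_per_day):
    """Days until the disk is full, ASSUMING net growth stays constant."""
    if net_growth_gb_per_day <= 0:
        return float("inf")  # not growing, so no projected fill date
    return (capacity_gb - used_gb) / net_growth_gb_per_day


# Numbers taken from the example above (all in GB).
capacity_gb = 500
used_start_gb = 400
committed_gb = 50
used_end_gb = 425

aged_out = rolled_out_gb(used_start_gb, committed_gb, used_end_gb)
net_growth = committed_gb - aged_out  # what the disk actually gained that day

print(aged_out, net_growth, runway_days(capacity_gb, used_end_gb, net_growth))
# prints: 25 25 3.0
```

In practice you would average the daily figures over a few weeks rather than rely on a single day, since ingest and bucket rolling are rarely even.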

Hope it helps.

P.S. Try looking at the Fire Brigade app, which might give you better insights. Also try using the data in the _introspection index.
