Recently, I noticed that the disk on one of my Indexers was nearly full. Currently, all event data is going into the main index and we had all the defaults set for bucket rolling behaviors in the main index. The server has been indexing data for at least two years.
We want to retain searchable event data going back one year and are not concerned with archiving beyond that, so I changed the archive policy to be more restrictive (changed frozenTimePeriodInSecs to 31556952 in $SPLUNK_HOME/etc/system/local/indexes.conf). I expected this to free up a lot of space by rolling data older than one year to frozen and deleting it, but it didn't. I came back on Monday morning after making this change and barely a dent had been made in the amount of free space. There are no cold buckets in my main index's coldPath right now, so my change must have had some effect.
I suspect that this Indexer was incorrectly sized when it was first set up, and that has led to this disk space issue. We ingest ~2.5 GB/day on this Indexer. The disk is 200 GB in total, with 140 GB allocated to the main index (which includes hot/warm/cold buckets).
Do I need to add more drive space and increase the size of my main index in order to fix this problem?
Yes, you probably do need to add more storage. If you are writing ~2.5 GB/day to the default index and you plan to retain data for 365 days, your storage requirement would be:
2.5 GB * 365 * 50% (on-disk size) = 456 GB
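That arithmetic is easy to check with a quick script. Note that the 50% on-disk ratio is only a rule of thumb; actual compression varies with the data, so measure your own environment before committing to a number:

```python
# Rough index sizing: daily ingest * retention days * on-disk ratio.
# The 0.5 ratio (compressed rawdata + index files as a fraction of raw
# ingest) is a common rule of thumb, not a guarantee.
daily_ingest_gb = 2.5
retention_days = 365
on_disk_ratio = 0.5

required_gb = daily_ingest_gb * retention_days * on_disk_ratio
print(f"~{required_gb:.0f} GB needed")  # ~456 GB needed
```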
You can use the Splunk sizing tool to help calculate your disk requirements.
If you are running Splunk v6.3 you can look at the DMC (Distributed Management Console) to get a better idea of how well data is compressing in your index, the time range your index covers, and how often your buckets are freezing.
You can also easily verify how data is expiring by searching for:

index=_internal sourcetype=splunkd bucketmover "will attempt to freeze"
These messages indicate that Splunk is deleting data (moving from cold to frozen). The log message will also indicate the reason why the data is rolling off; either due to aging out (frozenTimePeriodInSecs) or because of the storage limit (maxTotalDataSizeMB).
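If you want to see how often freezes are happening over time, a variation of that search can chart them. This is a sketch: it assumes the events carry the standard component=BucketMover field, as splunkd logs normally do:

```
index=_internal sourcetype=splunkd component=BucketMover "will attempt to freeze"
| timechart count
```

A spike in this chart right after a config change is a quick sanity check that the new retention settings are actually being applied.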
Indeed. To clarify: you've restricted the index to 140 GB. Using the rule-of-thumb 2.5 GB/day * 50%, that's enough for about 112 days of data. Restricting the retention to a year won't change anything, because there is no data that old to delete.
Makes sense! I think it must have been even worse before, because we had the index restricted to 140 GB, but the retention period was the default, which I believe is ~6 years. So we had storage that was sized to retain data for 112 days, but wouldn't delete until the data was 6 years old. I'm surprised that we didn't have this space problem earlier.
The oldest bucket gets frozen as soon as you hit one of the two restrictions, size or age. In your case, Splunk deletes buckets as soon as your 140 GB is full. Whether you theoretically allow one year or six is irrelevant to that; your primary constraint is space.
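As a rough illustration of that rule (a hypothetical simplification, not Splunk's actual implementation): the oldest bucket is frozen whenever either constraint is violated, and this repeats until both are satisfied.

```python
# Hypothetical sketch of the freeze decision: the oldest bucket is frozen
# while EITHER the index exceeds its size cap (maxTotalDataSizeMB) OR the
# bucket's newest event is older than frozenTimePeriodInSecs.

def buckets_to_freeze(buckets, max_total_mb, frozen_secs, now):
    """buckets: list of (latest_event_time, size_mb), oldest first."""
    remaining = list(buckets)
    frozen = []
    while remaining:
        oldest = remaining[0]
        total_mb = sum(size for _, size in remaining)
        too_big = total_mb > max_total_mb
        too_old = (now - oldest[0]) > frozen_secs
        if too_big or too_old:
            frozen.append(remaining.pop(0))
        else:
            break
    return frozen
```

With a 140,000 MB cap and all data younger than a year, only the size test ever fires, which is exactly the situation described above.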
So if the storage needs for my main index are 456 GB, would I also set maxTotalDataSizeMB to 456000? Or do you typically set maxTotalDataSizeMB to a value that is smaller than your total index space needs?
This is the slightly tricky part: you need to set both values. Let's stick with what you have in this example, which is a single indexer. You have a volume of data coming into your indexer. On disk, that data takes up roughly 50% of its raw size, but that value is just an estimate. So you calculate your disk requirement to be about 456 GB, and you could set maxTotalDataSizeMB to 456 GB (466944 MB). But you need to check regularly whether you're approaching that limit, otherwise you risk prematurely deleting data. You also need to make sure you have the storage available (whether that means growing the volume, adding more physical disk, etc.); you don't want to hit that limit and then realize you need more storage. So: set the max age for your data to one year, so that you're not retaining data you don't need, AND set the max size of the db, so that you don't just fill your disk and stop indexing altogether.
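For this scenario, the two settings together would look something like this in indexes.conf (both values come from the numbers discussed above; 466944 MB = 456 GB * 1024):

```
[main]
# Delete data once it is older than ~1 year (365.25 days in seconds)
frozenTimePeriodInSecs = 31556952
# Cap the index at ~456 GB so a full disk never halts indexing
maxTotalDataSizeMB = 466944
```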
In our environment we are very paranoid about expiring data, so we have several monitors in place to let us know when we approach our maxTotalDataSizeMB. First, we use rest API queries to compare the current size of each index with its max size. Secondly, we watch our _internal index for bucket rolls due to any reason other than reaching the frozenTimePeriodInSecs value. Finally, all of our volumes have alerts configured when they start approaching capacity.
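As a sketch of that first check, a REST-based search along these lines compares each index's current size against its cap (currentDBSizeMB and maxTotalDataSizeMB are fields exposed by the data/indexes endpoint; the pct_used field name here is just an illustration):

```
| rest /services/data/indexes
| eval pct_used = round(currentDBSizeMB / maxTotalDataSizeMB * 100, 1)
| table title currentDBSizeMB maxTotalDataSizeMB pct_used
```

Saved as an alert with a threshold on pct_used, this gives early warning well before size-based freezing begins.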
Sure! Here it is:
[default]
minRawFileSyncSecs = disable
throttleCheckPeriod = 15
rotatePeriodInSecs = 60
compressRawdata = true
quarantinePastSecs = 77760000
quarantineFutureSecs = 2592000
maxTotalDataSizeMB = 140000
maxHotIdleSecs = 0
maxMetaEntries = 1000000
serviceMetaPeriod = 25
syncMeta = true
assureUTF8 = false
frozenTimePeriodInSecs = 31556952
blockSignatureDatabase = _blocksignature
maxWarmDBCount = 300
maxConcurrentOptimizes = 3
coldToFrozenDir =
blockSignSize = 0
maxHotBuckets = 3
enableRealtimeSearch = true
maxHotSpanSecs = 7776000
coldToFrozenScript =
memPoolMB = auto
partialServiceMetaPeriod = 0
suppressBannerList =
rawChunkSizeBytes = 131072
sync = 0
maxRunningProcessGroups = 20
defaultDatabase = main
maxDataSize = auto
If there are no buckets in the colddb folder (and no significant decrease in size after you set frozenTimePeriodInSecs to one year), that means all your searchable data is stored in hot/warm buckets. Could you check the number of buckets in the db folder (how many hot and how many warm)?
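A quick way to get those counts is to look at the directory names: warm buckets are conventionally named db_<newestTime>_<oldestTime>_<id> and hot buckets hot_v1_<id>. This is a hypothetical helper, not a Splunk tool; the example path assumes the default main-index location and should be adjusted to your homePath:

```python
# Count hot vs. warm buckets in an index's db directory by naming convention
# (hot buckets start with "hot_", warm buckets with "db_").
from pathlib import Path

def count_buckets(db_path):
    hot = warm = 0
    for p in Path(db_path).iterdir():
        if not p.is_dir():
            continue
        if p.name.startswith("hot_"):
            hot += 1
        elif p.name.startswith("db_"):
            warm += 1
    return hot, warm

# e.g. count_buckets("/opt/splunk/var/lib/splunk/defaultdb/db")
```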