Why did setting "maxDataSize = auto_high_volume" for an index delete all events older than a year?

Ricapar
Communicator

I originally had this in my indexes.conf file:

[myindex]
homePath = $SPLUNK_DB/myindex/db
coldPath = $SPLUNK_DB/myindex/colddb
frozenTimePeriodInSecs = 94608000
thawedPath = $SPLUNK_DB/myindex/thaweddb
disabled = false
repFactor = auto

The retention time we wanted on this index is 3 years (see the frozenTimePeriodInSecs line). After that, data gets frozen (deleted).
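As a quick sanity check on that value, the arithmetic can be sketched in Python. This assumes 365-day years and ignores leap days; the variable names are only illustrative.

```python
# Sanity check: convert frozenTimePeriodInSecs to days and years.
# Assumption: a year is 365 days (leap days ignored).
SECONDS_PER_DAY = 24 * 60 * 60

frozen_time_period_in_secs = 94608000  # value from the [myindex] stanza

days = frozen_time_period_in_secs / SECONDS_PER_DAY
years = days / 365

print(f"{days:.0f} days = {years:.1f} years")
```

So the configured value does correspond to three years, not one.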

The index gets a considerable amount of data, and currently has a few thousand buckets. It was suggested that setting the maxDataSize would more efficiently manage bucket sizes.

Per the documentation for indexes.conf:

maxDataSize = <positive integer>|auto|auto_high_volume
    * The maximum size in MB for a hot DB to reach before a roll to warm is triggered.

The following line was added to the bottom of the stanza:

maxDataSize = auto_high_volume

After setting this (and doing a rolling cluster restart via the cluster master), I noticed that the index it was set on had rolled out all of the data it had older than a year.

What caused this? Based on the documentation on indexes.conf, that setting should have had no effect on when Splunk rolls cold to frozen.

yannK
Splunk Employee

The maxDataSize setting has nothing to do with the retention of old data. It only controls the maximum size of a hot bucket, and therefore when a hot bucket rolls to warm.

Make sure that another setting was not deployed along with it, such as frozenTimePeriodInSecs, maxTotalDataSizeMB, or any volume limits. Also check repFactor: if it requires more copies to be made, that reduces the actual shared storage space available to that index.

To investigate:
- Run "splunk cmd btool indexes list" on your indexers to check the effective settings.
- Look in splunkd.log on the indexers for any bucket freezing events. They mention the reason the buckets were deleted/frozen (size or time constraints).
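The second step can be roughly sketched as a keyword filter in Python. This is only a sketch: the function name `freeze_lines` is hypothetical, the sample lines are made up for illustration, and real splunkd.log entries are formatted differently, so the keywords may need adjusting.

```python
import re


def freeze_lines(lines, index_name):
    """Return the lines that mention both the index name and a freeze action."""
    freeze = re.compile(r"freez|frozen", re.IGNORECASE)
    return [ln for ln in lines if index_name in ln and freeze.search(ln)]


# Made-up placeholder lines, NOT real splunkd.log output.
sample = [
    "example: freezing bucket for idx=myindex reason=placeholder",
    "example: rolled hot bucket to warm for idx=myindex",
    "example: freezing bucket for idx=otherindex reason=placeholder",
]

for line in freeze_lines(sample, "myindex"):
    print(line)
```

To run it against a real log, pass an open file object instead of `sample`. On a default install the file is usually $SPLUNK_HOME/var/log/splunk/splunkd.log, but check the path on your indexers.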

Ricapar
Communicator

Unfortunately, it's too late to check the _internal index for any clues, as that data has already gone past its retention (and we don't keep frozen copies of it).

After we discovered the issue, we immediately undid the change. However, this is what the btool output looks like for that index as of now:

[myindex]
assureUTF8 = false
blockSignSize = 0
blockSignatureDatabase = _blocksignature
bucketRebuildMemoryHint = auto
coldPath = $SPLUNK_DB/myindex/colddb
coldPath.maxDataSizeMB = 0
coldToFrozenDir =
coldToFrozenScript =
compressRawdata = true
defaultDatabase = main
disabled = false
enableOnlineBucketRepair = true
enableRealtimeSearch = true
frozenTimePeriodInSecs = 94608000
homePath = $SPLUNK_DB/myindex/db
homePath.maxDataSizeMB = 0
hotBucketTimeRefreshInterval = 10
indexThreads = auto
maxBloomBackfillBucketAge = 30d
maxBucketSizeCacheEntries = 0
maxConcurrentOptimizes = 6
maxDataSize = auto
maxHotBuckets = 3
maxHotIdleSecs = 0
maxHotSpanSecs = 7776000
maxMemMB = 5
maxMetaEntries = 1000000
maxRunningProcessGroups = 12
maxRunningProcessGroupsLowPriority = 1
maxTimeUnreplicatedNoAcks = 300
maxTimeUnreplicatedWithAcks = 60
maxTotalDataSizeMB = 4294967295
maxWarmDBCount = 300
memPoolMB = auto
minRawFileSyncSecs = disable
minStreamGroupQueueSize = 2000
partialServiceMetaPeriod = 0
processTrackerServiceInterval = 1
quarantineFutureSecs = 2592000
quarantinePastSecs = 77760000
rawChunkSizeBytes = 131072
repFactor = auto
rotatePeriodInSecs = 60
serviceMetaPeriod = 25
serviceOnlyAsNeeded = true
serviceSubtaskTimingPeriod = 30
streamingTargetTsidxSyncPeriodMsec = 5000
suppressBannerList =
sync = 0
syncMeta = true
thawedPath = $SPLUNK_DB/myindex/thaweddb
throttleCheckPeriod = 15
tstatsHomePath = volume:_splunk_summaries/$_index_name/datamodel_summary
warmToColdScript =
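
To make the relevant lines easier to spot, here is a minimal Python sketch that pulls the size and time limits out of btool-style "key = value" output. The sample text is a subset of the output above; the `parse_stanza` helper and the choice of keys are my own assumptions, and it handles a single stanza only.

```python
# Subset of the btool output above, for a single stanza.
BTOOL_OUTPUT = """\
[myindex]
coldPath.maxDataSizeMB = 0
frozenTimePeriodInSecs = 94608000
homePath.maxDataSizeMB = 0
maxDataSize = auto
maxTotalDataSizeMB = 4294967295
"""

# Assumption: these are the size and time limits worth reviewing first.
KEYS_TO_REVIEW = (
    "frozenTimePeriodInSecs",
    "maxTotalDataSizeMB",
    "homePath.maxDataSizeMB",
    "coldPath.maxDataSizeMB",
    "maxDataSize",
)


def parse_stanza(text):
    """Parse 'key = value' lines into a dict, skipping headers and blanks."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("[") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


settings = parse_stanza(BTOOL_OUTPUT)
for key in KEYS_TO_REVIEW:
    print(f"{key} = {settings.get(key, '<not set>')}")

# Express the time-based retention in days for readability.
retention_days = int(settings["frozenTimePeriodInSecs"]) / 86400
print(f"time-based retention: {retention_days:.0f} days")
```

With the current values, the time-based retention still works out to 1095 days (three years), and maxDataSize is back to auto.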