Getting Data In

Why did setting "maxDataSize = auto_high_volume" for an index delete all events older than a year?

Communicator

I originally had this in my indexes.conf file:

[myindex]
homePath = $SPLUNK_DB/myindex/db
coldPath = $SPLUNK_DB/myindex/colddb
frozenTimePeriodInSecs = 94608000
thawedPath = $SPLUNK_DB/myindex/thaweddb
disabled = false
repFactor = auto

The retention time we want on this index is 3 years (see the frozenTimePeriodInSecs line). After that, data gets frozen (deleted).
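As a quick sanity check (assuming 365-day years), 94608000 seconds works out to exactly 3 years:

```shell
# 3 years expressed in seconds, assuming 365-day years
echo $((3 * 365 * 24 * 60 * 60))    # prints 94608000
```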

The index receives a considerable amount of data and currently has a few thousand buckets. It was suggested that setting maxDataSize would let Splunk manage bucket sizes more efficiently.

Per the documentation for indexes.conf:

maxDataSize = <positive integer>|auto|auto_high_volume
    * The maximum size in MB for a hot DB to reach before a roll to warm is triggered.

The following line was added to the bottom of the stanza:

maxDataSize = auto_high_volume

After setting this (and performing a rolling cluster restart via the cluster master), I noticed that the index had rolled all of its data older than a year out to frozen.

What caused this? Based on the indexes.conf documentation, that setting should have no effect on when Splunk rolls cold buckets to frozen.


Re: Why did setting "maxDataSize = auto_high_volume" for an index delete all events older than a year?

Ultra Champion

The maxDataSize setting has nothing to do with the retention of old data; it only controls the maximum size a hot bucket can reach before it rolls to warm (i.e., the hot-to-warm transition).

Make sure that another setting was not sneakily deployed alongside it, such as frozenTimePeriodInSecs, maxTotalDataSizeMB, any volume limits, or even repFactor if it requires more copies to be made and thereby reduces the shared storage space actually available to that index.
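For reference, a sketch of which indexes.conf settings govern which transition (the values here are illustrative only, not recommendations):

```
[myindex]
# Hot -> warm: maximum hot bucket size only; no effect on retention
maxDataSize = auto_high_volume

# Cold -> frozen by age: buckets whose newest event is older than this many seconds are frozen
frozenTimePeriodInSecs = 94608000

# Cold -> frozen by size: oldest buckets are frozen once the whole index exceeds this
maxTotalDataSizeMB = 500000
```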

To investigate:
- run "splunk cmd btool indexes list" on your indexers to check the effective configuration
- look in splunkd.log on the indexers for bucket-freezing events; they state why each bucket was deleted/frozen (a size or time constraint)
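Concretely, those checks might look like this on an indexer (the /opt/splunk install path is an assumption; adjust for your environment):

```shell
# Show the effective (merged) settings for the index, with the file each one comes from
/opt/splunk/bin/splunk cmd btool indexes list myindex --debug

# Look for bucket-freeze events; the BucketMover log lines state the reason
grep -i "BucketMover" /opt/splunk/var/log/splunk/splunkd.log
```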


Re: Why did setting "maxDataSize = auto_high_volume" for an index delete all events older than a year?

Communicator

Unfortunately it's too long ago to check the _internal index for clues; that data has already aged past its retention (and we don't keep frozen copies of it).

We reverted the change as soon as we discovered the issue. This is what the btool output looks like for that index now:

[myindex]
assureUTF8 = false
blockSignSize = 0
blockSignatureDatabase = _blocksignature
bucketRebuildMemoryHint = auto
coldPath = $SPLUNK_DB/myindex/colddb
coldPath.maxDataSizeMB = 0
coldToFrozenDir =
coldToFrozenScript =
compressRawdata = true
defaultDatabase = main
disabled = false
enableOnlineBucketRepair = true
enableRealtimeSearch = true
frozenTimePeriodInSecs = 94608000
homePath = $SPLUNK_DB/myindex/db
homePath.maxDataSizeMB = 0
hotBucketTimeRefreshInterval = 10
indexThreads = auto
maxBloomBackfillBucketAge = 30d
maxBucketSizeCacheEntries = 0
maxConcurrentOptimizes = 6
maxDataSize = auto
maxHotBuckets = 3
maxHotIdleSecs = 0
maxHotSpanSecs = 7776000
maxMemMB = 5
maxMetaEntries = 1000000
maxRunningProcessGroups = 12
maxRunningProcessGroupsLowPriority = 1
maxTimeUnreplicatedNoAcks = 300
maxTimeUnreplicatedWithAcks = 60
maxTotalDataSizeMB = 4294967295
maxWarmDBCount = 300
memPoolMB = auto
minRawFileSyncSecs = disable
minStreamGroupQueueSize = 2000
partialServiceMetaPeriod = 0
processTrackerServiceInterval = 1
quarantineFutureSecs = 2592000
quarantinePastSecs = 77760000
rawChunkSizeBytes = 131072
repFactor = auto
rotatePeriodInSecs = 60
serviceMetaPeriod = 25
serviceOnlyAsNeeded = true
serviceSubtaskTimingPeriod = 30
streamingTargetTsidxSyncPeriodMsec = 5000
suppressBannerList =
sync = 0
syncMeta = true
thawedPath = $SPLUNK_DB/myindex/thaweddb
throttleCheckPeriod = 15
tstatsHomePath = volume:_splunk_summaries/$_index_name/datamodel_summary
warmToColdScript =