repFactor = auto
homePath = volume:home/indexname/db
coldPath = volume:SAN/indexname/colddb
thawedPath = $SPLUNK_THAW_VOL/indexname/thaweddb

# the max settings are copied from main's default max settings
maxMemMB = 20
maxConcurrentOptimizes = 6
maxHotIdleSecs = 86400
maxHotBuckets = 10
maxDataSize = auto_high_volume
homePath.maxDataSizeMB = 409600
coldPath.maxDataSizeMB = 1536000
maxTotalDataSizeMB = 1945600
# maxTotalDataSizeMB = ?

# keep logs for 90 days
frozenTimePeriodInSecs = 7776000
The logs seem to be rolling from cold to frozen at around 60 days for all but one or two sourcetypes (so when I search back between 60 and 90 days, I only see one or two sourcetypes when there should be over 20).
The coldPath limit isn't even close to being hit on this index. I implemented this index configuration at the beginning of the year, so it should be keeping the data for 90 days, yet it's throwing data out before then. Are there other settings that can trigger a roll from cold to frozen? We have plenty of space on the drives as well.
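The retention math itself checks out: 7,776,000 seconds divided by 86,400 seconds per day is exactly 90 days, so the configured window matches the stated goal. A quick illustrative check in Python:

```python
# Sanity check: how many days does frozenTimePeriodInSecs = 7776000 cover?
frozen_time_period_in_secs = 7776000  # value from the stanza above
seconds_per_day = 24 * 60 * 60        # 86400

retention_days = frozen_time_period_in_secs / seconds_per_day
print(retention_days)  # 90.0
```

So the setting is not the problem; the question is what else is evicting buckets early.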
In regard to your settings:
homePath = volume:home/indexname/db
coldPath = volume:SAN/indexname/colddb
thawedPath = $SPLUNK_THAW_VOL/indexname/thaweddb
From the indexes.conf documentation:
thawedPath must be specified, and cannot use volume: syntax.
Choose a location convenient for reconstitution from archive goals.
For many sites, this may never be used.
From your settings:
# the max settings are copied from main's default max settings
maxMemMB = 20
From the indexes.conf documentation, this defaults to 5 in the newest version:

IMPORTANT: Calculate this number carefully. splunkd will crash if you set this number higher than the amount of memory available. The default is recommended for all environments.
homePath.maxDataSizeMB = 409600
coldPath.maxDataSizeMB = 1536000
maxTotalDataSizeMB = 1945600
frozenTimePeriodInSecs = 7776000
All of these can affect the cold-to-frozen decision. Once the homePath size limit is reached, maxWarmDBCount is reached, or the hot volume limit (volume:home in your example) is hit, buckets roll to cold.
From there, buckets roll to frozen when coldPath.maxDataSizeMB (1536000 MB) is reached, when frozenTimePeriodInSecs is exceeded, or when the cold volume (volume:SAN in your example) hits its limit.
Note this all applies per indexer.
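The cold-to-frozen decision above can be sketched as a simplified model. This is illustrative Python only (function and parameter names are my own, not Splunk internals), covering the three limits just listed:

```python
import time

def freeze_reason(bucket_latest_epoch, cold_used_mb, cold_max_mb,
                  total_used_mb, total_max_mb, frozen_secs, now=None):
    """Simplified model of why a cold bucket might roll to frozen.

    Checks the same limits discussed above: the retention window
    (frozenTimePeriodInSecs), coldPath.maxDataSizeMB, and
    maxTotalDataSizeMB. Illustrative only, not Splunk's real code.
    """
    now = time.time() if now is None else now
    if now - bucket_latest_epoch > frozen_secs:
        return "frozenTimePeriodInSecs exceeded"
    if cold_used_mb > cold_max_mb:
        return "coldPath.maxDataSizeMB exceeded (oldest bucket evicted)"
    if total_used_mb > total_max_mb:
        return "maxTotalDataSizeMB exceeded (oldest bucket evicted)"
    return None  # bucket stays in cold

# A bucket whose newest event is 100 days old, under a 90-day window:
print(freeze_reason(0, 500_000, 1_536_000, 600_000, 1_945_600,
                    7_776_000, now=100 * 86_400))
# -> frozenTimePeriodInSecs exceeded
```

Note the size checks fire even when every event in the bucket is well inside the retention window, which is exactly why "plenty of space" per volume still matters per indexer.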
solarboyz1 provided some example searches for this. Also, personally I wouldn't use maxHotIdleSecs or tweak your maxHotBuckets setting unless you know what you are doing.
Finally, FYI: auto_high_volume is designed for higher-volume indexes.
FYI within the Alerts for Splunk Admins app I have two alerts that relate to this scenario:
IndexerLevel - Buckets have been frozen due to index sizing, effectively:
index=_internal `indexerhosts` sourcetype=splunkd "BucketMover - will attempt to freeze" NOT "because frozenTimePeriodInSecs="
There are multiple factors to why a bucket rolls.
Run the following search:
index=_* component=BucketMover "will attempt to freeze"
You should see events similar to:
07-24-2014 01:30:51.609 +0200 INFO BucketMover - will attempt to freeze: candidate='/opt/SP/apps/splunk/splunk-6.0.1/var/lib/splunk/rest/db/db_1392823223_1392819715_1' **because frozenTimePeriodInSecs=2419200 exceeds difference between now=1406158251 and latest=1392823223**
These events show the reason the buckets were rolled, which should help pinpoint the root cause.
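To tally reasons across many such events, the freeze reason and the bucket's timestamps can be pulled out of each line. A rough Python sketch using the sample event above (the regex is mine, and it assumes the standard db_&lt;latest&gt;_&lt;earliest&gt;_&lt;id&gt; bucket naming):

```python
import re

# Sample BucketMover event (from the search output above).
line = ("07-24-2014 01:30:51.609 +0200 INFO BucketMover - will attempt to "
        "freeze: candidate='/opt/SP/apps/splunk/splunk-6.0.1/var/lib/splunk"
        "/rest/db/db_1392823223_1392819715_1' because "
        "frozenTimePeriodInSecs=2419200 exceeds difference between "
        "now=1406158251 and latest=1392823223")

# Bucket directories are named db_<latest>_<earliest>_<id>; the text
# after "because" is the freeze reason.
m = re.search(r"db_(\d+)_(\d+)_\d+' because (.+)$", line)
latest, earliest, reason = int(m.group(1)), int(m.group(2)), m.group(3)

print(reason.split("=")[0])  # frozenTimePeriodInSecs
print(latest - earliest)     # bucket time span in seconds
```

Grouping by the first token of the reason quickly shows whether time-based or size-based eviction dominates.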
Thank you, this is definitely helpful! One thing still isn't adding up, though: if I search just for the index in question, the only reason it ever gives is that frozenTimePeriodInSecs=7776000 is exceeded by the difference between now=number and latest=number.
I don't suppose there is any other reason it might be getting evicted?
So here is something I just thought of: it constantly says it is evicting due to the frozenTimePeriod, yet none of the data seems to come close to that period. However, I did find a sourcetype that every now and then throws in very old data. Could that one bit of old data cause the entire bucket to get evicted, even if the majority of the data in it is not past the frozen time period?
@briancronrath yes, but only if you are rolling based on an index / volume size limit rather than the time-based limit (frozenTimePeriodInSecs).
Size-based rolling is oldest bucket first which means the oldest piece of data within a bucket determines when to roll.
frozenTimePeriodInSecs would ensure all data was past the required date (even the newest data in the bucket) before rolling the entire bucket.
So, in the frozen bucket events:
07-24-2014 01:30:51.609 +0200 INFO BucketMover - will attempt to freeze: candidate='/opt/SP/apps/splunk/splunk-6.0.1/var/lib/splunk/rest/db/db_1392823223_1392819715_1' because frozenTimePeriodInSecs=2419200 exceeds difference between now=1406158251 and latest=1392823223
The latest time should be the timestamp of the newest event in the bucket. Are you receiving freeze events where now - latest < 7776000?
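That check can be written down directly (a hypothetical helper; the 90-day value comes from the stanza in question):

```python
FROZEN_SECS = 7776000  # 90 days, from the index's frozenTimePeriodInSecs

def frozen_too_early(now_epoch, latest_epoch, period=FROZEN_SECS):
    """True when a bucket froze although its newest event had not yet
    aged past the retention window -- meaning something other than
    frozenTimePeriodInSecs must have evicted it."""
    return (now_epoch - latest_epoch) < period

# The sample event above (note it ran with a 28-day period, 2419200):
print(frozen_too_early(1406158251, 1392823223))
# -> False: the difference (13335028 s) exceeds even the 90-day window
```

If this ever returns True for your index's freeze events, the eviction was size-driven, not time-driven.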