Solved: Opinions on Index Optimization

tgiles · ‎06-20-2012

Hi, all.

I'm trying to fix some optimization issues I'm having with Splunk indexes and wanted some input on a proposed index adjustment.

On my indexers, I have two classes of systems. For Class A, I only need logs for 90 days. For Class B, I need to keep logs for a year before they get frozen out.

So, I think of the bucket situation like so:

| Class A | Hot = 1 day | Warm = 30 days | Cold = 90 days | Frozen = day 91  |
| Class B | Hot = 1 day | Warm = 90 days | Cold = 1 year  | Frozen = day 367 |

If possible, I'd like to keep the hot buckets down to 1 day chunks. This will make sure that any old data gets cleared out on a regular basis. From what I understand, setting the maxHotIdleSecs to one day will make sure that a particular hot bucket will, at most, only contain one day's worth of data.

Here are a couple of snippets from a proposed indexes.conf. Does anyone have any opinions or misgivings about the configurations below? Has anyone run into an issue with having smaller hot buckets?

[Class_A_System]
homePath   = $SPLUNK_DB/Class_A_System/db
coldPath   = $SPLUNK_DB/Class_A_System/colddb
thawedPath = $SPLUNK_DB/Class_A_System/thaweddb
maxMemMB = 10
# hot 1 day, 1 day's worth of data
maxHotSpanSecs = 86400
maxHotIdleSecs = 86400
# warm 30 days (1 day per bucket)
maxWarmDBCount = 30
# frozen day 91
frozenTimePeriodInSecs = 7776000
maxConcurrentOptimizes = 6
maxHotBuckets = 10
maxDataSize = auto
# locked total disk size
maxTotalDataSizeMB = 102400


[Class_B_System]
homePath   = $SPLUNK_DB/Class_B_System/db
coldPath   = $SPLUNK_DB/Class_B_System/colddb
thawedPath = $SPLUNK_DB/Class_B_System/thaweddb
maxMemMB = 20
# hot 1 day, 1 day's worth of data
maxHotSpanSecs = 86400
maxHotIdleSecs = 86400
# warm 90 days (1 day per bucket)
maxWarmDBCount = 90
# frozen day 367
frozenTimePeriodInSecs = 31708800
maxConcurrentOptimizes = 6
maxHotBuckets = 10
maxDataSize = auto_high_volume

Thanks for any opinions or input you might have.

tgiles · ‎07-19-2012

As an update for anyone who might find this post, the above configuration basically works without a whole lot of fuss. Hot buckets roll over daily, are kept as warm for 30/90 days, then roll to cold, where they are kept for our standard retention time.

I found this site to be extremely helpful to me in visualizing how the indexes roll out.

Good luck!

bpaul_splunk · ‎02-16-2016

As jreuter stated, do not use maxHotSpanSecs = 86400. It has been shown that this can cause undesired behavior if the setting aligns with the hour or the day.

jreuter_splunk · ‎02-16-2016

Please avoid setting maxHotSpanSecs = 86400. This can cause undesired behavior if the setting aligns with the hour or the day, and can cause a bucket explosion - in some cases the creation of millions of buckets at midnight has been observed, each containing a single event. 86399 or 86401 are both fine, just avoid the exact alignment of 1 hour or 1 day.
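A minimal sketch of that advice applied to the Class_A_System stanza from the original post. Only maxHotSpanSecs changes; 86399 is one of the two values suggested above, and every other setting is assumed to stay as originally posted.

```ini
[Class_A_System]
# One second short of a day, so the span does not align exactly
# with the hour or the day (86401 would serve the same purpose).
maxHotSpanSecs = 86399
# Unchanged from the original post.
maxHotIdleSecs = 86400
```

The same one-line change would apply to the Class_B_System stanza.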

pajohnston · ‎07-20-2012

Thanks for that - the description on the third-party site is very clear.
