Opinions on Index Optimization

tgiles
Path Finder

Hi, all.

I'm trying to fix some optimization issues I'm having with Splunk indexes and wanted some input on a proposed index adjustment.

On my indexers, I have two classes of systems. For Class A, I only need to keep logs for 90 days. For Class B, I need to keep logs for a year before they get frozen out.

So, I think of the bucket situation like so:

| Class A | Hot = 1 day | Warm = 30 days | Cold = 90 days | Frozen = day 91  |
| Class B | Hot = 1 day | Warm = 90 days | Cold = 1 year  | Frozen = day 367 |

If possible, I'd like to keep the hot buckets down to 1-day chunks. This will make sure that any old data gets cleared out on a regular basis. From what I understand, setting maxHotIdleSecs to one day will make sure that a particular hot bucket will, at most, contain only one day's worth of data.
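For reference, the indexes.conf documentation describes these two settings differently: maxHotSpanSecs is the one that caps the time range a bucket can cover, while maxHotIdleSecs only rolls a hot bucket that has stopped receiving data. A minimal sketch of the distinction, using a hypothetical stanza name and a span value one second short of a day:

```ini
# Hypothetical stanza name, for illustration only
[example_index]
# Upper bound on the time span covered by a single hot/warm bucket, in seconds.
# This is the setting that limits a bucket to roughly one day of events.
maxHotSpanSecs = 86399
# Roll a hot bucket to warm if it has received no data for this many seconds.
# It does not limit the time range of the events inside the bucket.
maxHotIdleSecs = 86400
```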

Here are a couple of snippets from a future imaginary indexes.conf. Does anyone have any opinions or misgivings about the configurations below? Has anyone run into an issue with having smaller hot buckets?

[Class_A_System]
homePath   = $SPLUNK_DB/Class_A_System/db
coldPath   = $SPLUNK_DB/Class_A_System/colddb
thawedPath = $SPLUNK_DB/Class_A_System/thaweddb
maxMemMB = 10
# hot 1 day, 1 day's worth of data
maxHotSpanSecs = 86400
maxHotIdleSecs = 86400
# warm 30 days (1 day per bucket)
maxWarmDBCount = 30
# frozen day 91
frozenTimePeriodInSecs = 7776000
maxConcurrentOptimizes = 6
maxHotBuckets = 10
maxDataSize = auto
# locked total disk size
maxTotalDataSizeMB = 102400


[Class_B_System]
homePath   = $SPLUNK_DB/Class_B_System/db
coldPath   = $SPLUNK_DB/Class_B_System/colddb
thawedPath = $SPLUNK_DB/Class_B_System/thaweddb
maxMemMB = 20
# hot 1 day, 1 day's worth of data
maxHotSpanSecs = 86400
maxHotIdleSecs = 86400
# warm 90 days (1 day per bucket)
maxWarmDBCount = 90
# frozen day 367
frozenTimePeriodInSecs = 31708800
maxConcurrentOptimizes = 6
maxHotBuckets = 10
maxDataSize = auto_high_volume
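
The retention values in these stanzas are day counts multiplied by 86400 seconds per day. As a quick sanity check of that arithmetic, here are the relevant lines again with the calculation spelled out (values are taken from the stanzas above; only the comments are new):

```ini
[Class_A_System]
# 90 days * 86400 s/day = 7776000 s; data older than this is eligible to freeze
frozenTimePeriodInSecs = 7776000
# 100 GB * 1024 MB/GB = 102400 MB
maxTotalDataSizeMB = 102400

[Class_B_System]
# 367 days * 86400 s/day = 31708800 s (a 365-day year would be 31536000 s)
frozenTimePeriodInSecs = 31708800
```

Note that maxTotalDataSizeMB is also a freezing trigger: once the index grows past that size, the oldest buckets are frozen even if they are younger than frozenTimePeriodInSecs.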

Thanks for any opinions or input you might have.

bpaul_splunk
Splunk Employee

As jreuter stated, do not use maxHotSpanSecs = 86400. It has been shown that this can cause undesired behavior when the span aligns exactly with the hour or the day.


jreuter_splunk
Splunk Employee

Please avoid setting maxHotSpanSecs = 86400. This can cause undesired behavior if the setting aligns with the hour or the day, and can cause a bucket explosion - in some cases the creation of millions of buckets at midnight has been observed, each containing a single event. 86399 or 86401 are both fine, just avoid the exact alignment of 1 hour or 1 day.
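
Applied to the stanzas from the original post, the change could look like the fragment below. This is a sketch only: 86399 is one of the two values named above, just the maxHotSpanSecs lines differ from the posted configuration, and the remaining settings are omitted here rather than changed.

```ini
[Class_A_System]
# One second short of a day, so the span does not align exactly with the day
maxHotSpanSecs = 86399

[Class_B_System]
# Same adjustment for the one-year retention class
maxHotSpanSecs = 86399
```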

tgiles
Path Finder

As an update for anyone who might find this post, the above configuration basically works without a whole lot of fuss. Hot buckets roll over daily, are kept as warm for 30/90 days, and are then rolled to cold to be kept for our standard retention time.

I found this site to be extremely helpful to me in visualizing how the indexes roll out.

Good luck!

pajohnston
Explorer

Thanks for that - the description on the third-party site is very clear.
