Splunk Search

Opinions on Index Optimization

tgiles
Path Finder

Hi, all.

I'm trying to fix some optimization issues I'm having with Splunk indexes and wanted some input on a proposed index adjustment.

On my indexers, I have two classes of systems. For Class A, I only need to keep logs for 90 days. For Class B, I need to keep logs for a year before they are frozen.

So, I think of the bucket situation like so:

| Class A | Hot = 1 day | Warm = 30 days | Cold = 90 days | Frozen = day 91  |
| Class B | Hot = 1 day | Warm = 90 days | Cold = 1 year  | Frozen = day 367 |

If possible, I'd like to keep the hot buckets down to 1-day chunks, so that old data gets cleared out on a regular basis. From what I understand, setting maxHotSpanSecs to one day caps the time span a single hot bucket can cover at one day's worth of data, and setting maxHotIdleSecs to one day rolls any hot bucket that has received no data for a day.

Here are a couple of snippets from a hypothetical future indexes.conf. Does anyone have opinions or misgivings about the configurations below? Has anyone run into issues with smaller hot buckets?

[Class_A_System]
homePath   = $SPLUNK_DB/Class_A_System/db
coldPath   = $SPLUNK_DB/Class_A_System/colddb
thawedPath = $SPLUNK_DB/Class_A_System/thaweddb
maxMemMB = 10
# hot 1 day, 1 day's worth of data
maxHotSpanSecs = 86400
maxHotIdleSecs = 86400
# warm 30 days (1 day per bucket)
maxWarmDBCount = 30
# frozen day 91
frozenTimePeriodInSecs = 7776000
maxConcurrentOptimizes = 6
maxHotBuckets = 10
maxDataSize = auto
# locked total disk size
maxTotalDataSizeMB = 102400


[Class_B_System]
homePath   = $SPLUNK_DB/Class_B_System/db
coldPath   = $SPLUNK_DB/Class_B_System/colddb
thawedPath = $SPLUNK_DB/Class_B_System/thaweddb
maxMemMB = 20
# hot 1 day, 1 day's worth of data
maxHotSpanSecs = 86400
maxHotIdleSecs = 86400
# warm 90 days (1 day per bucket)
maxWarmDBCount = 90
# frozen day 367
frozenTimePeriodInSecs = 31708800
maxConcurrentOptimizes = 6
maxHotBuckets = 10
maxDataSize = auto_high_volume

Thanks for any opinions or input you might have.

tgiles
Path Finder

As an update for anyone who might find this post: the above configuration basically works without much fuss. Hot buckets roll over daily, are kept as warm for 30/90 days, then roll to cold, where they are kept for our standard retention time.

I found this site to be extremely helpful to me in visualizing how the indexes roll out.

Good luck!

bpaul_splunk
Splunk Employee

As jreuter stated, do not use maxHotSpanSecs = 86400. This has been shown to cause undesired behavior when the span aligns exactly with the hour or the day.

jreuter_splunk
Splunk Employee

Please avoid setting maxHotSpanSecs = 86400. This can cause undesired behavior if the setting aligns with the hour or the day, and can cause a bucket explosion - in some cases the creation of millions of buckets at midnight has been observed, each containing a single event. 86399 or 86401 are both fine, just avoid the exact alignment of 1 hour or 1 day.
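
As a rough sketch of that change applied to the two stanzas from the question: only maxHotSpanSecs is shown, 86401 is one of the two values suggested above, and every other setting is assumed to stay as originally posted.

```ini
[Class_A_System]
# One second past the exact 24-hour boundary, to avoid aligning
# the hot bucket span with the day (86399 would work as well).
maxHotSpanSecs = 86401

[Class_B_System]
# Same adjustment for the one-year retention class.
maxHotSpanSecs = 86401
```

The buckets still cover roughly one day each, so the warm and cold day counts in the question should not need to change because of this adjustment alone.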

pajohnston
Explorer

Thanks for that - the description on the third-party site is very clear.
