Getting Data In

What are best practices for setting up the retention policy for our indexer buckets?

adam_reber
Path Finder

We have an ingest rate of about 3 TB/day spread across roughly 20 indexes, with retention times of 2 to 5 years depending on the index. Four of these indexes account for 90% of the data, and we expect our ingest rate to roughly double or triple eventually. We currently have our maxHotSpanSecs = 86400, maxHotBuckets = 10, and quarantine(Past|Future)Secs = 86400. I have some questions about best practices for setting up buckets and wanted feedback on our current settings.

  1. Is it better to have each maxHotSpanSecs for the large indexes be something like 24 hours, or should they be larger, such as 7 days or more? What is the reasoning for this?
  2. How many hot buckets should be set for a large index? Does it make a difference vs. a small index?
  3. What are good settings for quarantine buckets? What could be the negative impacts of setting this to 86400?
  4. How are quarantine buckets rolled over to warm/cold? Is it based on the earliest event in the index?
  5. How is frozenTimePeriodInSecs applied to quarantine buckets?
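For reference, the settings described above would look something like this in indexes.conf (the index name and paths are hypothetical; the frozenTimePeriodInSecs value is just one example for a 5-year retention index):

```ini
# indexes.conf -- hypothetical stanza reflecting the settings under discussion
[big_index_1]
homePath   = $SPLUNK_DB/big_index_1/db
coldPath   = $SPLUNK_DB/big_index_1/colddb
thawedPath = $SPLUNK_DB/big_index_1/thaweddb

# Current settings under discussion
maxHotSpanSecs       = 86400       # roll a hot bucket once it spans 24h of event time
maxHotBuckets        = 10          # normal + quarantined hot buckets count toward this
quarantinePastSecs   = 86400       # events >1 day in the past go to a quarantine bucket
quarantineFutureSecs = 86400       # events >1 day in the future go to a quarantine bucket

# Example 5-year retention: buckets whose newest event is older than this are frozen
frozenTimePeriodInSecs = 157680000 # 5 * 365 * 86400
```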

tweaktubbie
Communicator

http://dev.splunk.com/view/java-sdk/SP-CAAAEJ2 answers parts of 3, 4, and 5, I think: the regular bucket mechanism applies to quarantine buckets as well, so they behave like regular hot/warm buckets:

maxHotBuckets: A number that indicates the maximum number of hot buckets that can exist per index. When this value is exceeded, Splunk Enterprise rolls the least recently used (LRU) hot bucket to warm. Both normal hot buckets and quarantined hot buckets count towards this total. This setting operates independently of maxHotIdleSecs, which can also cause hot buckets to roll.

So consider raising maxHotBuckets by 1, or better still: prevent or filter out those bad-timestamp events with props.conf on your forwarder so quarantine is no longer triggered at all, since a quarantine bucket takes up one of the hot bucket slots. Using indexed time may be a good first step.
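A sketch of what that props.conf timestamp hygiene could look like (the sourcetype name and format string are hypothetical; tune the limits to your data):

```ini
# props.conf -- hypothetical sourcetype; constrain timestamp parsing so
# mis-parsed dates don't land far in the past/future and trigger quarantine
[my:batch:sourcetype]
TIME_PREFIX  = ^
TIME_FORMAT  = %Y-%m-%d %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 19
MAX_DAYS_AGO   = 7    # reject parsed dates older than a week
MAX_DAYS_HENCE = 1    # reject parsed dates more than a day in the future
```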

quarantinePastSecs = 86400 seems rather low compared to the Splunk default of 900 days (77760000 seconds). When you add a new logfile/source, all its events older than 1 day will automatically end up in a quarantine bucket. Likewise, when a weekly batch file runs and events dated throughout the previous week are imported, they end up in quarantine. Is that really necessary?
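If that weekly batch pattern is expected, one option is simply widening the past-time window for the affected indexes (the index name is hypothetical; the 10-day value is an illustrative choice, not a recommendation):

```ini
# indexes.conf -- hypothetical index receiving weekly batch imports;
# widen the past-time window so week-old events stay in normal hot buckets
[batch_import_index]
quarantinePastSecs   = 864000   # 10 days: covers a weekly batch with slack
quarantineFutureSecs = 86400    # keep the poster's current 1-day future window
```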

I wonder how you could set different bucket settings for quarantined buckets specifically, since as I understand it these settings apply to both normal and quarantined hot buckets?

For 1 and 2, keep indexes.conf in mind: are you mostly interested in good indexing performance, and/or in good performance for dashboards and queries on the imported data? As usual, "it depends" on your specific needs and load.

Compare it to looking for something in your drawers: the more drawers you have to check, the longer it takes. But to store everything, you either need more drawers, or the same number of larger ones.
