Splunk Enterprise

How do I configure an index to manage future events?

I_am_Jeff
Communicator

Our development team is using systems with clocks set in the future, some as far as two years ahead. I'd like to configure an index for them to use and keep my other indexes realistic, as far as true timestamping goes. Right now I'm running version 4.3.4, with plans to upgrade to version 5 soon.

I came up with something like this, ignoring the home/cold/thawed paths:

maxDataSize = auto
maxHotIdleSecs = 2592000
maxTotalDataSizeMB = 20000
homePath.maxDataSizeMB = 1000
frozenTimePeriodInSecs = 86400

Looking through indexes.conf.spec, the quarantineFutureSecs option looks interesting. How do I search and manage a quarantine bucket? I assume it's not enabled by default, as my main index contains several events more than 30 days in the future.

The volume appears to be small, so I'm not too concerned with retaining the index for two years. (My future stance on that may change.) I'd like the buckets to expire very soon after two years.

I'm open to suggestions and recommendations. Thanks in advance!

sowings
Splunk Employee

The quarantine parameters of an index are intended for precisely this. They're there to detect "anomalous" times, and keep the events out of the main timeline. However, this granularity is provided at the "bucket" level--the substructure of an index. http://docs.splunk.com/Documentation/Splunk/5.0.4/Indexer/HowSplunkstoresindexes

What you're asking for is to groom the existing index so that future events aren't there at all. Sadly, I don't know of a way to do this. The quarantine mechanism would simply put those future events within their own bucket. The bucket would still be part of the index, though, so the overall time span of the index would still include those future events.

Trimming the index to store only two years back, however, is easy. That's your frozenTimePeriodInSecs. The default is 188697600 seconds, which is 2184 days or roughly six years. You can change it (per-index) to target that two-year retention.
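
For reference, the arithmetic behind a two-year value can be sketched in a few lines of Python. This is only an illustration: it assumes a 365-day year with no leap days, and the helper name retention_secs is made up for the example rather than anything Splunk provides.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def retention_secs(days):
    """Convert a retention period in days into a frozenTimePeriodInSecs value."""
    return days * SECONDS_PER_DAY


# Two years, assuming 365-day years (an assumption, adjust as needed).
two_years = retention_secs(2 * 365)

# The shipped default of 188697600 seconds, expressed in days for comparison.
default_days = 188697600 // SECONDS_PER_DAY

print(f"frozenTimePeriodInSecs = {two_years}")
print(f"default retention is {default_days} days")
```

The printed setting line would go under the stanza for the index in question in indexes.conf.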

sowings
Splunk Employee

It's a bit of indirection, but you can look in splunkd.log for events from "databasePartitionPolicy" with the string "quar" in them (sample search: index=_internal source=*splunkd.log component=databasePartitionPolicy quar). This will give you the index and ID number of a bucket originally created as a quarantine bucket. You can then look for that bucket in the directory or in the output from dbinspect...

The "Fire Brigade" application can help, too.

http://apps.splunk.com/app/1581

lukejadamec
Super Champion

The name of each bucket is made up of the first and last event times and a unique ID. The times are epoch times, so you will need an epoch time converter to read them in standard date-time format. Once you have converted a few, the buckets with events a couple of years in the future should be easy to spot without the converter.
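
If you'd rather not convert the names by hand, here is a minimal Python sketch of the idea. It assumes warm and cold bucket directories are named db_<newest>_<oldest>_<id> with epoch-second timestamps, and it skips anything that doesn't match that pattern (hot buckets, for example). The function names and the sample directory names are made up for illustration.

```python
import re
from datetime import datetime, timezone

# Assumed layout of warm/cold bucket directory names: db_<newest>_<oldest>_<id>
BUCKET_RE = re.compile(r"^db_(\d+)_(\d+)_(\d+)$")


def parse_bucket(name):
    """Return the bucket's ID and UTC time range, or None if the name does not match."""
    match = BUCKET_RE.match(name)
    if match is None:
        return None
    newest, oldest, bucket_id = (int(part) for part in match.groups())
    return {
        "id": bucket_id,
        "newest": datetime.fromtimestamp(newest, tz=timezone.utc),
        "oldest": datetime.fromtimestamp(oldest, tz=timezone.utc),
    }


def future_buckets(names, now=None):
    """List parsed buckets whose newest event is later than `now`."""
    if now is None:
        now = datetime.now(timezone.utc)
    parsed = (parse_bucket(name) for name in names)
    return [b for b in parsed if b is not None and b["newest"] > now]


if __name__ == "__main__":
    # Hypothetical directory names, not taken from a real index.
    sample = ["db_1704067200_1703980800_3", "db_1767225600_1767139200_7", "hot_v1_9"]
    for bucket in future_buckets(sample):
        print(bucket["id"], bucket["oldest"].isoformat(), bucket["newest"].isoformat())
```

In practice you would feed it the directory listing of the index's db path (os.listdir would do) instead of the sample list.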

I_am_Jeff
Communicator

This sounds promising. What's happened, has happened. I'm okay with that. Perhaps there is a way to determine which buckets are quarantined?

I_am_Jeff
Communicator

Just a quick update.

Users decided to reset their clocks without notifying me. 😞

I've asked them to send to an index named "future" for now. I'll have to set my alarm to periodically check and delete "old" files.

linu1988
Champion

From this, I think summary indexes are more flexible and realistic than planning two years ahead; Splunk may come up with some other magic 🙂 Just a comment, I don't have much insight into this.
