Our development team is using systems with clocks set in the future, some as far as two years ahead. I'd like to configure a separate index for them to use and keep my other indexes realistic as far as true timestamping goes. Right now I'm running version 4.3.4, with plans to upgrade to version 5 soon.
I came up with something like this, ignoring the home/cold/thawed paths:
maxDataSize = auto
maxHotIdleSecs = 2592000
maxTotalDataSizeMB = 20000
homePath.maxDataSizeMB = 1000
frozenTimePeriodInSecs = 86400
Looking through indexes.conf.spec, the quarantineFutureSecs option looks interesting. How do I search and manage a quarantine bucket? I assume it's not a default option, as my main index contains several events more than 30 days in the future.
The volume appears to be small, so I'm not too concerned with retaining the index for two years. (My future stance on that may change.) I'd like the buckets to expire very soon after two years.
I'm open to suggestions and recommendations. Thanks in advance!
From this, I think summary indexes are more flexible and realistic than planning two years ahead; Splunk may come up with some other magic by then 🙂 Just a comment, I don't have much insight into this.
Just a quick update.
Users decided to reset their clocks without notifying me. 😞
I've asked them to send to an index named "future" for now. I'll have to set my alarm to periodically check and delete "old" files.
The quarantine parameters of an index are intended for precisely this. They're there to detect "anomalous" times, and keep the events out of the main timeline. However, this granularity is provided at the "bucket" level--the substructure of an index. http://docs.splunk.com/Documentation/Splunk/5.0.4/Indexer/HowSplunkstoresindexes
What you're asking for is to groom the existing index so that future events aren't there at all. Sadly, I don't know of a way to do this. The quarantine mechanism would simply put those future events within their own bucket. The bucket would still be part of the index, though, so the overall time span of the index would still include those future events.
Trimming the index to only store two years back, however, is easy. That's your frozenTimePeriodInSecs. The default is 188697600 seconds (about 2184 days, or roughly six years). You can change it per index to target that two-year retention.
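For the arithmetic, here is a small Python sketch that converts a retention period in days into the seconds value frozenTimePeriodInSecs expects. The helper name days_to_secs is made up for illustration, and treating a year as 365 days is an assumption; pad the result if you want a safety margin.

```python
SECONDS_PER_DAY = 86400

def days_to_secs(days):
    """Convert whole days to seconds (hypothetical helper, not part of Splunk)."""
    return days * SECONDS_PER_DAY

# Two years, assuming 365-day years.
two_years = days_to_secs(2 * 365)

# The shipped default for frozenTimePeriodInSecs, expressed in days.
default_days = 188697600 / SECONDS_PER_DAY

print("frozenTimePeriodInSecs =", two_years)
print("default in days:", default_days)
```

Note that the 86400 in the stanza from the question works out to a single day, so it would need to be raised to something like the two-year value for events to survive that long.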
This sounds promising. What's happened, has happened. I'm okay with that. Perhaps there is a way to determine which buckets are quarantined?
The name of each bucket is made up of the first and last event times and a unique ID. The times are epoch times, so you will need an epoch time converter to read them in standard date/time format. Once you have converted a few, the buckets with events a couple of years in advance should be easy to spot without the converter.
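If you'd rather not convert by hand, the check can be scripted. This is a rough Python sketch, assuming warm/cold bucket directories are named db_<newest>_<oldest>_<id> with epoch seconds; the function names, the sample directory names, and the 30-day slack are all made up for illustration. Names that don't match the pattern (hot buckets, for example) are skipped.

```python
import re
from datetime import datetime, timezone

# Assumed naming pattern: db_<newest epoch>_<oldest epoch>_<id>
BUCKET_RE = re.compile(r"^db_(\d+)_(\d+)_(\d+)$")

def parse_bucket(name):
    """Return the epoch times and ID encoded in a bucket name, or None."""
    m = BUCKET_RE.match(name)
    if m is None:
        return None
    newest, oldest, bucket_id = (int(g) for g in m.groups())
    return {
        "name": name,
        "newest": newest,
        "oldest": oldest,
        "id": bucket_id,
        # Human-readable UTC form of the newest event time.
        "newest_utc": datetime.fromtimestamp(newest, tz=timezone.utc).isoformat(),
    }

def future_buckets(names, now_epoch, slack_secs=30 * 86400):
    """List bucket names whose newest event is more than slack_secs past now_epoch."""
    hits = []
    for name in names:
        info = parse_bucket(name)
        if info is not None and info["newest"] > now_epoch + slack_secs:
            hits.append(name)
    return hits

# Hypothetical directory listing, not taken from a real index.
sample = ["db_1450000000_1449000000_12", "db_1380000000_1379000000_7", "hot_v1_13"]
flagged = future_buckets(sample, now_epoch=1385000000)
print(flagged)
```

In practice you would feed it os.listdir() on the index's db directory and use time.time() for now_epoch.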
It's a bit of indirection, but you can look in splunkd.log for events from "databasePartitionPolicy" with the string "quar" in them (sample search:
index=_internal source=*splunkd.log component=databasePartitionPolicy quar). This will give you the index and ID number of a bucket originally created as quarantine. You can then look for that in the directory or in the output from dbinspect.
The "Fire Brigade" application can help, too.