What is a hot_quar_v1_ directory, vs. a standard hot_v1_ directory?

Contributor

According to the documentation, and generally in practice, hot buckets are named

  hot_v1_<id> 

... but I am noticing some of the hot directories labeled as

  hot_quar_v1_<id> 

What is the difference, and why?

1 Solution

Splunk Employee

The difference is that 'hot_v1_<id>' is a normal hot bucket, where data is inserted based on timestamp. The 'hot_quar_v1_<id>' is a quarantine bucket. These buckets are meant to catch data that is either older than the limit specified in indexes.conf or further in the future than indexes.conf allows. The data is inserted into a quarantine bucket to keep the index from being polluted by old and/or future data.

http://docs.splunk.com/Documentation/Splunk/latest/admin/indexesconf

quarantinePastSecs = <positive integer>
    * Events with timestamp of quarantinePastSecs older than "now" will be
      dropped into quarantine bucket.
    * Defaults to 77760000 (900 days).
    * This is a mechanism to prevent the main hot buckets from being polluted with
      fringe events.

quarantineFutureSecs = <positive integer>
    * Events with timestamp of quarantineFutureSecs newer than "now" will be
      dropped into quarantine bucket.
    * Defaults to 2592000 (30 days).
    * This is a mechanism to prevent main hot buckets from being polluted with
      fringe events.
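
The two settings above amount to a time window around "now". Below is a minimal Python sketch of that check, assuming the comparison is a strict "older than" / "newer than" as the excerpt words it. The function name `is_quarantined` and its parameters are hypothetical and only illustrate the rule; this is not Splunk's actual implementation.

```python
import time

# Defaults quoted from the indexes.conf excerpt above.
QUARANTINE_PAST_SECS = 77760000    # 900 days
QUARANTINE_FUTURE_SECS = 2592000   # 30 days


def is_quarantined(event_ts, now=None,
                   past_secs=QUARANTINE_PAST_SECS,
                   future_secs=QUARANTINE_FUTURE_SECS):
    """Return True if an event timestamp (epoch seconds) falls outside
    the window [now - past_secs, now + future_secs].

    Hypothetical helper: the strict comparisons are an assumption based
    on the wording "older than" / "newer than" in the documentation.
    """
    if now is None:
        now = time.time()
    too_old = event_ts < now - past_secs
    too_new = event_ts > now + future_secs
    return too_old or too_new


if __name__ == "__main__":
    now = 1_700_000_000  # fixed reference time so the result is repeatable
    print(is_quarantined(now - 86400, now))              # one day old
    print(is_quarantined(now - 901 * 86400, now))        # 901 days old
    print(is_quarantined(now + 31 * 86400, now))         # 31 days ahead
```

With the defaults, an event one day old stays in a normal hot bucket, while events 901 days old or 31 days in the future would land in a quarantine bucket.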

Path Finder

I found cases where events older than quarantinePastSecs slipped into normal hot buckets. When is this possible? Is this a bug, and how can it be prevented?

Splunk Employee

That is expected behavior. In this case, quarantinePastSecs needs to be set appropriately so that those events with far-past timestamps fall within it.

Path Finder

Adding to the comments: is there any configuration property that forces the Splunk indexer to include results from quarantine buckets in searches?

Communicator

I'm curious what the retention of events in quarantine buckets is. I can find events that were indexed incorrectly because of US/European timestamp settings. There appears to be at most one quar bucket per index. I understand you want to keep other buckets clean and tidy, but on the other hand we can't afford to miss events even though they're stored with a wrong timestamp in the past. It probably has the same max settings as the hot buckets, but after a restart (which often causes a bucket roll) the quar bucket still seems to exist with all events in it.

Contributor

I figured as much; the documentation just didn't clearly spell it out anywhere I could find. Is there any quick way to isolate that data? Obviously I could scan for data outside of the accepted date range, but I'm looking for something more straightforward.

According to the index metadata, my "latest" event is in the year 0468, which I'm having trouble turning into an actual date in Splunk...

The earliest date is Dec 31, 1969 7:00:00 PM (00:00:00 01 Jan 1970 UTC), easy enough to figure that one out...
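
The earliest-date reading can be checked with a short Python sketch: epoch 0 rendered in UTC and in a fixed UTC-5 zone. The UTC-5 offset is an assumption inferred from the 7:00 PM local time quoted above, and `describe_epoch` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone


def describe_epoch(epoch_secs, utc_offset_hours):
    """Render an epoch timestamp in UTC and in a fixed-offset local zone.

    Hypothetical helper; the offset is supplied by the caller and no
    daylight-saving rules are applied.
    """
    utc = datetime.fromtimestamp(epoch_secs, tz=timezone.utc)
    local = utc.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    return utc.isoformat(), local.isoformat()


if __name__ == "__main__":
    # Epoch 0 with an assumed UTC-5 local zone.
    print(describe_epoch(0, -5))
    # → ('1970-01-01T00:00:00+00:00', '1969-12-31T19:00:00-05:00')
```

An event whose timestamp parses to epoch 0 therefore shows up as Dec 31, 1969 7:00 PM in a UTC-5 display, which matches the value reported in the index metadata.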
