The buckets are named:
db_latesttime_earliesttime_idnum
where latesttime is the time stamp of the latest event in the bucket, earliesttime is the time stamp of the earliest event in the bucket, and idnum is an ID number that must be unique across all buckets in the database.
Splunk uses the time numbers to decide whether to look in a bucket at all for events in a given time span. If you narrow these numbers, then events that are in the bucket but outside the new time range will not be returned in search. If you widen them, Splunk will waste time looking in the bucket for events it will never find.
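To make the time-range check concrete, here is a minimal Python sketch of the idea. It is not Splunk's actual code; the function names, the example bucket name, and the choice of inclusive bounds are all assumptions made for illustration.

```python
from typing import NamedTuple


class BucketName(NamedTuple):
    latest: int    # time stamp of the latest event (epoch seconds)
    earliest: int  # time stamp of the earliest event (epoch seconds)
    idnum: int     # bucket ID number


def parse_bucket_name(name: str) -> BucketName:
    """Split a db_latesttime_earliesttime_idnum name into its three numbers."""
    prefix, latest, earliest, idnum = name.split("_")
    if prefix != "db":
        raise ValueError(f"not a db_ bucket name: {name!r}")
    return BucketName(int(latest), int(earliest), int(idnum))


def bucket_may_match(name: str, search_earliest: int, search_latest: int) -> bool:
    """Return True if the bucket's time range overlaps the search time span.

    Assumption: both ranges are treated as inclusive.
    """
    bucket = parse_bucket_name(name)
    return bucket.earliest <= search_latest and bucket.latest >= search_earliest


# Hypothetical bucket covering epoch 1200000000 through 1300000000.
example = "db_1300000000_1200000000_7"
print(bucket_may_match(example, 1250000000, 1260000000))  # span inside the bucket
print(bucket_may_match(example, 1300000001, 1400000000))  # span after the bucket
```

If the name were edited to a narrower range, a check like this would skip the bucket for events that are really in it; if it were widened, the bucket would be opened for spans it cannot satisfy.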
ID numbers matter if you are merging databases or restoring buckets from an archive, and you must ensure that every bucket has a unique ID after any merge. Although ID numbers are generated sequentially by normal Splunk indexing, they do not have to be sequential, nor can you count on them remaining unchanged. A simple way to ensure unique IDs would be to append a distinct digit (or series of digits) to buckets from each specific source, so that buckets from different sources could not possibly match on their last digit.
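The digit-appending approach could be sketched roughly as follows. This only computes new names as strings and does not touch anything on disk; the helper names and the example bucket names are hypothetical, and the sketch assumes one single distinct digit per source so that IDs from different sources always differ in their last digit.

```python
from typing import Dict, List


def tag_bucket_id(name: str, source_digit: str) -> str:
    """Append one source-specific digit to the ID part of a bucket name."""
    prefix, latest, earliest, idnum = name.split("_")
    if prefix != "db" or not idnum.isdigit():
        raise ValueError(f"unexpected bucket name: {name!r}")
    if len(source_digit) != 1 or not source_digit.isdigit():
        raise ValueError("source_digit must be exactly one digit")
    return "_".join([prefix, latest, earliest, idnum + source_digit])


def merged_names(sources: Dict[str, List[str]]) -> List[str]:
    """Rename buckets from several sources and verify the new IDs are unique.

    `sources` maps each source's distinct digit to its bucket names.
    """
    seen_ids = set()
    result = []
    for digit, names in sources.items():
        for name in names:
            new_name = tag_bucket_id(name, digit)
            new_id = new_name.rsplit("_", 1)[1]
            if new_id in seen_ids:
                raise ValueError(f"duplicate bucket ID after merge: {new_id}")
            seen_ids.add(new_id)
            result.append(new_name)
    return result


# Two hypothetical sources that both contain a bucket with ID 7.
print(merged_names({
    "1": ["db_1300000000_1200000000_7"],
    "2": ["db_1350000000_1310000000_7"],
}))
```

A duplicate can still occur if one source already contains the same ID twice, which is why the sketch checks the merged set rather than trusting the suffix alone.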
In general, you should not tune bucket sizes without extensive and deep knowledge of how indexing and searching operate. Most departures from the defaults will decrease search performance over that data, sometimes enormously, and will only very rarely produce any noticeable improvement. Use auto or auto_high_volume, and accept or copy the defaults for most parameters.
Also, the bucket number should really contain only digits, no letters, and its field length is limited.
These are the warm buckets in each index; the times in the db_# directory names are stored as UTC epoch seconds.
If you're using the default Splunk index, buckets will be stored in
$SPLUNK_HOME/var/lib/splunk/defaultdb/db
Each index has a maximum number of warm buckets, which is specified in your indexes.conf (maxWarmDBCount, defaults to 300). By default, Splunk sets the bucket size to 10GB for 64-bit systems and 750MB on 32-bit systems.
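As a rough back-of-the-envelope check, those two defaults give an upper bound on warm bucket storage per index. The sketch below assumes every warm bucket reaches its full size, which real buckets often do not, and the function name is made up for illustration.

```python
def max_warm_storage_mb(warm_bucket_count: int, bucket_size_mb: int) -> int:
    """Upper bound on warm storage, assuming every bucket is completely full."""
    return warm_bucket_count * bucket_size_mb


MB_PER_GB = 1024

# 300 warm buckets at 10GB each (64-bit figure quoted above).
print(max_warm_storage_mb(300, 10 * MB_PER_GB) / MB_PER_GB, "GB")

# 300 warm buckets at 750MB each (32-bit figure quoted above).
print(max_warm_storage_mb(300, 750) / MB_PER_GB, "GB")
```

With the figures quoted above this works out to about 3000GB per index on 64-bit systems and roughly 220GB on 32-bit systems.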
For further info on backup, retirement, and archiving best practices see:
http://www.splunk.com/wiki/Deploy:UnderstandingBuckets
http://docs.splunk.com/Documentation/Splunk/5.0/Indexer/Setaretirementandarchivingpolicy
http://www.splunk.com/wiki/Deploy:BestPracticesForBackingUp#How_data_moves_through_Splunk
@Chris_R_ FYSA all 3 of your urls no longer work.