This is my indexes.conf configuration:
[volume:hot_warm]
path = /store/hot_warm
maxVolumeDataSizeMB = 1450000
[volume:cold]
path = /store/cold
maxVolumeDataSizeMB = 9400000
[pan]
homePath = volume:hot_warm/pan/db
coldPath = volume:cold/pan/colddb
tstatsHomePath = volume:hot_warm/pan/datamodel_summary
thawedPath = /restore/pan/thaweddb
coldToFrozenDir = /store/cold/archive/pan
maxDataSize = auto
frozenTimePeriodInSecs = 31536000 -> When data is moved from cold to frozen; that is, after 1 year.
maxTotalDataSizeMB = 20000000 -> Maximum size of an index; that is, 20 TB.
enableTsidxReduction = true -> Reduces the size of TSIDX files; results in smaller buckets but slower searches.
timePeriodInSecBeforeTsidxReduction = 2592000 -> After 30 days, TSIDX reduction kicks in.
In this case, will maxTotalDataSizeMB or maxVolumeDataSizeMB take precedence?
As per my understanding, maxVolumeDataSizeMB is the total size of databases this directory can hold. In other words, it can store hot_warm data until it reaches 1.45 TB, and the databases will then roll into cold (which can hold 9.4 TB). Is that right?
Will the pan index be limited to 20 TB in total across hot_warm and cold?
Also, keeping maxDataSize as auto, I believe there will be 300 hot_warm buckets of 750 MB each in the hot_warm volume. Is that right?
And if I change it to auto_high_volume, would there be 300 hot_warm buckets of 10 GB each? If so, would it be keeping a lot more hot_warm data compared to how it is kept now?
Is maxWarmDBCount 300 by default?
Let's go one by one:
1: Will maxTotalDataSizeMB or maxVolumeDataSizeMB take precedence?
A: maxTotalDataSizeMB and maxVolumeDataSizeMB are equal in precedence; in other words, both limits are always active, and whichever one you hit first will be enforced.
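To make that concrete, here is a minimal sketch of how the two caps interact (the numbers come from your config; the comments are illustrative):

[volume:hot_warm]
path = /store/hot_warm
# volume-wide cap, enforced across ALL indexes that use this volume
maxVolumeDataSizeMB = 1450000

[pan]
homePath = volume:hot_warm/pan/db
# per-index cap, enforced across hot, warm, AND cold for this index only
maxTotalDataSizeMB = 20000000

If the volume cap is hit first, the oldest warm bucket rolls to cold; if the per-index cap is hit first, the oldest bucket is frozen. Neither setting disables the other.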
2a: As per my understanding, maxVolumeDataSizeMB is the total size of databases this directory can hold.
A: No, it is the total of EVERYTHING, not just "databases"; if you dump a 20TB tarfile in there, then you have no room for any "database" buckets.
2b: In other words, it can store hot_warm data until it reaches 1.45 TB, and the databases will then roll into cold (which can hold 9.4 TB). Is that right?
A: Yes, assuming nothing else non-Splunk is also using that directory (it should not be).
3: Will the pan index be limited to 20 TB in total across hot_warm and cold?
A: Yes, exactly.
4a: Keeping maxDataSize as auto, I believe there will be 300 hot_warm buckets of 750 MB each in the hot_warm volume. Is that right?
A: The auto setting fixes the bucket size at 750MB, which in your case means 1450000/750 ~ 1933 buckets (almost all of these will be warm).
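As a back-of-the-envelope check, assuming the volume is filled entirely by this one index:

# maxDataSize = auto -> 750 MB per bucket
# 1450000 MB volume cap / 750 MB per bucket ~ 1933 buckets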
4b: And if I change it to auto_high_volume, would there be 300 hot_warm buckets of 10 GB each? If so, would it be keeping a lot more hot_warm data compared to how it is kept now?
A: The auto_high_volume setting sets the bucket size to 10GB on 64-bit systems, or 1GB on 32-bit systems, which in your case (assuming 64-bit) means 1450000/10240 ~ 141 buckets.
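The same arithmetic, under the same assumption that this index fills the whole volume:

# maxDataSize = auto_high_volume -> 10 GB = 10240 MB per bucket (64-bit)
# 1450000 MB volume cap / 10240 MB per bucket ~ 141 buckets

So the total amount of hot_warm data retained stays the same (the volume cap governs it); only the number of buckets it is sliced into changes.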
5: Is maxWarmDBCount 300 by default?
A: No, the default is 200.
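If you actually want 300 warm buckets, you can set it explicitly; a sketch, using your pan stanza as the example:

[pan]
# the oldest warm bucket rolls to cold once this count is exceeded
maxWarmDBCount = 300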
NOTE: You are using MB*1000*1000 for TB, which is not correct (it is MB*1024*1024).
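Worked out in binary units (1 TB = 1024*1024 MB = 1048576 MB):

# 20 TB   = 20 * 1048576   = 20971520 MB (you configured 20000000 MB ~ 19.07 TB)
# 1.45 TB = 1.45 * 1048576 ~ 1520435 MB (you configured 1450000 MB ~ 1.38 TB)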
Also note, the maximum size of your warm buckets may slightly exceed maxDataSize, due to post-processing and timing issues with the rolling policy.
Also note, some settings may vary from version to version.
@woodcock
You meant to say that maxVolumeDataSizeMB has precedence over the (potentially multiple, added-together) maxTotalDataSizeMB values?
It is kind of both so I reworded my answer.