I want to use volumes in indexes.conf to limit the space used by my indexes.
On each index, I see four paths: homePath / coldPath / thawedPath / tstatsHomePath.
The last one seems to be used for accelerated data models or report accelerations.
How does this work?
After testing and researching, I can confirm how it works. Here are the conclusions:
example :
[volume:testvolumeA]
path = /mount/disk
maxVolumeDataSizeMB=500
[index1]
homePath = volume:testvolumeA/index1/db
coldPath = volume:testvolumeA/index1/colddb
thawedPath = volume:testvolumeA/index1/thaweddb
tstatsHomePath = volume:_splunk_summaries/defaultdb/datamodel_summary
In this case, index1's homePath, coldPath, and thawedPath are all considered to be on the same logical volume.
To enforce the volume size limit, only these index locations are summed up, and when a bucket has to be frozen, it will be one of the buckets located on this logical volume.
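Note that the volume cap works alongside the usual per-index retention controls; whichever limit is reached first causes the oldest bucket to roll to frozen. A sketch combining both (maxTotalDataSizeMB and frozenTimePeriodInSecs are standard per-index settings; the values here are illustrative):

```ini
[volume:testvolumeA]
path = /mount/disk
maxVolumeDataSizeMB = 500

[index1]
homePath = volume:testvolumeA/index1/db
coldPath = volume:testvolumeA/index1/colddb
thawedPath = volume:testvolumeA/index1/thaweddb
# Per-index caps still apply in addition to the 500 MB volume cap:
maxTotalDataSizeMB = 400          # freeze when this index alone exceeds 400 MB
frozenTimePeriodInSecs = 7776000  # ...or when buckets are older than 90 days
```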
example :
[volume:testvolumeA]
path = /mount/disk
maxVolumeDataSizeMB=500
[volume:testvolumeB]
path = /mount/disk
maxVolumeDataSizeMB=100
[index1]
homePath = volume:testvolumeA/index1/db
coldPath = volume:testvolumeA/index1/colddb
thawedPath = volume:testvolumeA/index1/thaweddb
tstatsHomePath = volume:_splunk_summaries/defaultdb/datamodel_summary
[index2]
homePath = volume:testvolumeB/index2/db
coldPath = volume:testvolumeB/index2/colddb
thawedPath = volume:testvolumeB/index2/thaweddb
tstatsHomePath = volume:_splunk_summaries/defaultdb/datamodel_summary
The two volumes testvolumeA and testvolumeB are monitored as two separate entities, and each of them only measures the subfolders defined using that volume.
That means that if you enforce a volume size limit, each volume applies its limit separately, to its own index folders.
In my example:
testvolumeA will keep its monitored subfolders under 500 MB
testvolumeB will keep its monitored subfolders under 100 MB
This means that the actual physical path /mount/disk can grow up to 500 MB + 100 MB = 600 MB.
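As a sanity check, the worst-case disk usage is simply the sum of every volume's maxVolumeDataSizeMB that points at the same mount point. A minimal sketch of that arithmetic (the stanzas mirror the example above; Python's configparser is a simplification and does not handle every indexes.conf subtlety):

```python
from configparser import ConfigParser

# The two volume stanzas from the example above.
INDEXES_CONF = """
[volume:testvolumeA]
path = /mount/disk
maxVolumeDataSizeMB = 500

[volume:testvolumeB]
path = /mount/disk
maxVolumeDataSizeMB = 100
"""

def worst_case_mb(conf_text, mount):
    """Sum maxVolumeDataSizeMB over all volumes whose path is under `mount`."""
    cp = ConfigParser()
    cp.read_string(conf_text)
    total = 0
    for section in cp.sections():
        if section.startswith("volume:") and cp.get(section, "path").startswith(mount):
            total += cp.getint(section, "maxVolumeDataSizeMB", fallback=0)
    return total

print(worst_case_mb(INDEXES_CONF, "/mount/disk"))  # 600
```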
I think this will also be the situation if you use a volume pointing to $SPLUNK_DB, since _splunk_summaries also uses it:
[volume:_splunk_summaries]
path = $SPLUNK_DB
[volume:summary]
path = $SPLUNK_DB
So you can size your volume limits to ensure that their sum will not fill your physical disk.
Or you can redefine all your paths to use a single volume, and manage the size globally.
example :
[volume:testvolumeA]
path = /mount/disk
maxVolumeDataSizeMB=500
[index1]
homePath = volume:testvolumeA/index1/db
coldPath = volume:testvolumeA/index1/colddb
thawedPath = volume:testvolumeA/index1/thaweddb
tstatsHomePath = volume:_splunk_summaries/defaultdb/datamodel_summary
[_internal]
homePath = $SPLUNK_DB/_internaldb/db
coldPath = $SPLUNK_DB/_internaldb/colddb
thawedPath = $SPLUNK_DB/_internaldb/thaweddb
In this case you will see a warning in splunkd.log for the _internal index's homePath, coldPath, and thawedPath, as they are not defined on a volume but reside under the same path as a volume.
example :
06-07-2017 16:27:01.976 -0700 WARN ProcessTracker - (child_6__Fsck) IndexConfig - idx=summary Path homePath='/mount/disk/_internaldb/db' (realpath '/mount/disk/_internaldb/db') is inside volume=testvolumeA (path='/mount/disk', realpath='/mount/disk'), but does not reference that volume. Space used by homePath will not be volume-mananged. Please check indexes.conf for configuration errors.
However, it appears that the warning only exists for homePath, coldPath, and thawedPath; it does not exist for tstatsHomePath.
This is why we do not get those warnings on a vanilla Splunk install: by default, tstatsHomePath uses the volume
[volume:_splunk_summaries]
path = $SPLUNK_DB
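To make the warning from the example above go away, the internal index's paths can themselves be rewritten in terms of the volume, so that its space is also volume-managed. A sketch, assuming the same testvolumeA (adjust the subfolder names to your layout):

```ini
[volume:testvolumeA]
path = /mount/disk
maxVolumeDataSizeMB = 500

[_internal]
# Reference the volume instead of a raw path, so _internal counts
# toward testvolumeA's 500 MB and the splunkd.log warning disappears.
homePath = volume:testvolumeA/_internaldb/db
coldPath = volume:testvolumeA/_internaldb/colddb
thawedPath = volume:testvolumeA/_internaldb/thaweddb
```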