I want to use volumes in indexes.conf to limit the space used by my indexes.
On each index, I see four paths: homePath / coldPath / thawedPath / tstatsHomePath.
The last one seems to be used for accelerated data models or report accelerations.
How does this work?
After testing and researching, I can confirm how it works. Here are the conclusions:
example :
[volume:testvolumeA]
path = /mount/disk
maxVolumeDataSizeMB=500
[index1]
homePath = volume:testvolumeA/index1/db
coldPath = volume:testvolumeA/index1/colddb
thawedPath = volume:testvolumeA/index1/thaweddb
tstatsHomePath = volume:_splunk_summaries/defaultdb/datamodel_summary
In this case, index1's homePath, coldPath, and thawedPath are all considered to be on the same logical volume.
To enforce the volume size limit, only these index locations are summed up, and when a bucket has to be frozen, it will be one of the buckets located on this logical volume.
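Note that the volume cap works alongside the usual per-index retention controls; whichever limit is reached first causes the oldest bucket to roll to frozen. A sketch combining both (maxTotalDataSizeMB and frozenTimePeriodInSecs are standard per-index settings; the values here are illustrative):

```ini
[volume:testvolumeA]
path = /mount/disk
maxVolumeDataSizeMB = 500

[index1]
homePath = volume:testvolumeA/index1/db
coldPath = volume:testvolumeA/index1/colddb
thawedPath = volume:testvolumeA/index1/thaweddb
# Per-index caps still apply in addition to the 500 MB volume cap:
maxTotalDataSizeMB = 400          # freeze when this index alone exceeds 400 MB
frozenTimePeriodInSecs = 7776000  # ...or when buckets are older than 90 days
```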
example :
[volume:testvolumeA]
path = /mount/disk
maxVolumeDataSizeMB=500
[volume:testvolumeB]
path = /mount/disk
maxVolumeDataSizeMB=100
[index1]
homePath = volume:testvolumeA/index1/db
coldPath = volume:testvolumeA/index1/colddb
thawedPath = volume:testvolumeA/index1/thaweddb
tstatsHomePath = volume:_splunk_summaries/defaultdb/datamodel_summary
[index2]
homePath = volume:testvolumeB/index2/db
coldPath = volume:testvolumeB/index2/colddb
thawedPath = volume:testvolumeB/index2/thaweddb
tstatsHomePath = volume:_splunk_summaries/defaultdb/datamodel_summary
The two volumes testvolumeA and testvolumeB are monitored as two separate entities, and each of them only measures the subfolders defined using that volume.
That means that if you enforce a volume size limit, each volume applies its limit separately, to its own index folders.
In my example:
testvolumeA will keep its monitored subfolders under 500 MB
testvolumeB will keep its monitored subfolders under 100 MB
This means that the actual physical path /mount/disk can grow up to 500 MB + 100 MB = 600 MB.
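As a sanity check, the worst-case disk usage is simply the sum of every volume's maxVolumeDataSizeMB that points at the same mount point. A minimal sketch of that arithmetic (the stanzas mirror the example above; Python's configparser is a simplification and does not handle every indexes.conf subtlety):

```python
from configparser import ConfigParser

# The two volume stanzas from the example above.
INDEXES_CONF = """
[volume:testvolumeA]
path = /mount/disk
maxVolumeDataSizeMB = 500

[volume:testvolumeB]
path = /mount/disk
maxVolumeDataSizeMB = 100
"""

def worst_case_mb(conf_text, mount):
    """Sum maxVolumeDataSizeMB over all volumes whose path is under `mount`."""
    cp = ConfigParser()
    cp.read_string(conf_text)
    total = 0
    for section in cp.sections():
        if section.startswith("volume:") and cp.get(section, "path").startswith(mount):
            total += cp.getint(section, "maxVolumeDataSizeMB", fallback=0)
    return total

print(worst_case_mb(INDEXES_CONF, "/mount/disk"))  # 600
```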
I think this will also be the situation if you use a volume pointing to $SPLUNK_DB, since _splunk_summaries also uses it:
[volume:_splunk_summaries]
path = $SPLUNK_DB
[volume:summary]
path = $SPLUNK_DB
So you can size your volume limits to ensure that their sum will not fill your physical disk.
Or you can redefine all your paths to use a single volume, and manage the size globally.
example :
[volume:testvolumeA]
path = /mount/disk
maxVolumeDataSizeMB=500
[index1]
homePath = volume:testvolumeA/index1/db
coldPath = volume:testvolumeA/index1/colddb
thawedPath = volume:testvolumeA/index1/thaweddb
tstatsHomePath = volume:_splunk_summaries/defaultdb/datamodel_summary
[_internal]
homePath = $SPLUNK_DB/_internaldb/db
coldPath = $SPLUNK_DB/_internaldb/colddb
thawedPath = $SPLUNK_DB/_internaldb/thaweddb
In this case you will see a warning in splunkd.log for the _internal index's homePath, coldPath, and thawedPath, as they are not defined on a volume but reside under the same path as a volume.
example :
06-07-2017 16:27:01.976 -0700 WARN ProcessTracker - (child_6__Fsck) IndexConfig - idx=summary Path homePath='/mount/disk/_internaldb/db' (realpath '/mount/disk/_internaldb/db') is inside volume=testvolumeA (path='/mount/disk', realpath='/mount/disk'), but does not reference that volume. Space used by homePath will not be volume-mananged. Please check indexes.conf for configuration errors.
However, it appears that the warning only exists for homePath, coldPath, and thawedPath; it does not exist for tstatsHomePath.
This is why we do not get those warnings on a vanilla Splunk install: by default, tstatsHomePath uses the volume
[volume:_splunk_summaries]
path = $SPLUNK_DB
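To make the warning from the example above go away, the internal index's paths can themselves be rewritten in terms of the volume, so that its space is also volume-managed. A sketch, assuming the same testvolumeA (adjust the subfolder names to your layout):

```ini
[volume:testvolumeA]
path = /mount/disk
maxVolumeDataSizeMB = 500

[_internal]
# Reference the volume instead of a raw path, so _internal counts
# toward testvolumeA's 500 MB and the splunkd.log warning disappears.
homePath = volume:testvolumeA/_internaldb/db
coldPath = volume:testvolumeA/_internaldb/colddb
thawedPath = volume:testvolumeA/_internaldb/thaweddb
```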