maxVolumeDataSizeMB Setting Precedence

foresterd
Loves-to-Learn

For the two indexes.conf volume settings below - would one take precedence over the other if they use the same path?

[volume:hotwarm]
path = /splunk-data/hot-warm
# ~5.8 TB
maxVolumeDataSizeMB = 5800000


[volume:_splunk_summaries]
path = /splunk-data/hot-warm
# ~ 100GB
maxVolumeDataSizeMB = 100000


I am observing that the volume with the lower value appears to take precedence over the volume with the higher value. Every index configured to use the "hotwarm" volume appears to be capped at about 100 GB in the Monitoring Console (MC), under Indexing -> Indexes and Volumes -> Volume Detail: Instance.

Below is the sample data I see within the MC for one indexer for hotwarm.

Volume Usage

Volume     Volume Usage (GB)    Volume Capacity (GB)    Volume Path
hotwarm    948.96               5664.06                 /splunk-data/hot-warm

Index Directories Using This Volume

Index:Directory           Disk Usage (GB)    Data Age (Days)
paloalto:home             95.53              11
asa:home                  91.57              15
wsa:home                  87.06              96
extrahop:home             86.46              157
firepower:home            85.50              188
adaudit:home              80.08              539
phantom_container:home    78.44              497
sep:home                  75.07              394
waf:home                  64.47              489

In the sample above, usage for the hotwarm volume never goes past 1 TB and no individual index goes past 100 GB, which leads me to believe the lower setting takes precedence.

After a ton of digging (btool, SPL, Linux commands, etc.), the same setting keeps showing up as the possible culprit: maxVolumeDataSizeMB = 100000.

If this behavior is confirmed, I assume I'd need to create another volume for my summaries? If not, I am truly stumped, as I see no other comparable setting anywhere in the system that would cause the limitation.

By the way, the indexers are clustered (10 peers) with SF=2 and RF=3, on Red Hat 7.3, running Splunk 7.3.1. I also saw this type of configuration shown as an example in the 7.3.1 Admin Manual (https://docs.splunk.com/Documentation/Splunk/7.3.1/Admin/Indexesconf), which leads me to believe it is possible for volumes to share a path. Example subset from the Admin Manual below (both volumes use the same path):

### Indexes may be allocated space in effective groups by sharing volumes ###

# perhaps we only want to keep 100GB of summary data and other
# low-volume information
[volume:small_indexes]
path = /mnt/splunk_indexes
maxVolumeDataSizeMB = 100000

# and this is our main event series, allowing 50 terabytes
[volume:large_indexes]
path = /mnt/splunk_indexes
maxVolumeDataSizeMB = 50000000


yannK
Splunk Employee

"For the two indexes.conf volume settings below - would one take precedence over the other if they use the same path?"

 

I would recommend not using the same path for more than one volume, and not nesting one volume's path inside another. (Likewise, do not nest one index's paths inside another index's paths, and do not mix volume paths and non-volume paths in the same folders.)

Otherwise, Splunk's measurement of the disk space used will be incorrect, and you may see data frozen sooner than expected.

You should also probably see warnings in splunkd.log when the indexers start, about the risk of path conflicts.
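
As a rough illustration of the advice above, the two volumes from the question could be pointed at separate directories. This is only a sketch: /splunk-data/summaries is a hypothetical directory chosen for the example, and the size limits are copied unchanged from the question.

```
# Sketch only, not a tested configuration.
[volume:hotwarm]
path = /splunk-data/hot-warm
# ~5.8 TB
maxVolumeDataSizeMB = 5800000

[volume:_splunk_summaries]
# Hypothetical directory, outside /splunk-data/hot-warm,
# so the two volumes no longer measure the same files.
path = /splunk-data/summaries
# ~100 GB
maxVolumeDataSizeMB = 100000
```

Index stanzas normally refer to a volume by name (for example, a homePath beginning with volume:hotwarm), so under that assumption only the volume's path would need to change, not the index definitions.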
