Solved: How does the Splunk GUI calculate the size of indexes and data models associated with the index?

_pravin · ‎10-02-2023

Hi Community,

We have a weird situation where one of our newest Splunk installs (3 months old) ran out of disk space. The capacity of the server was 500 GB.

When I checked the size of each index in the GUI, the sizes were all under the limit. The sum of all of them was under 250 GB, which made sense, as the maximum size of each index is set to 500 GB (the default). But when I calculated the size of the data models associated with the indexes, I could see that the data models had used almost 250 GB.

My understanding was that the data models should also be included in the index capacity, but they seemed to be exceeding the limits.

Can anyone please throw some light on this topic?

Regards,

Pravin

richgalloway · ‎10-02-2023

DMA (data model acceleration) data is stored in the same location (by default) as the index the accelerated data came from, but is not included in the index size, so it is not covered by index size limits. When sizing an index, one should leave room on the storage device for DMA or use the tstatsHomePath setting in indexes.conf to put DMA output elsewhere.
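The sizing point above can be sketched as simple arithmetic. This is only an illustration: the function name is hypothetical (it is not a Splunk API), and the figures are the approximate ones reported in this thread (a 500 GB server, roughly 250 GB of indexes, and data model summaries of roughly 250 GB and 11 GB).

```python
def disk_headroom_gb(capacity_gb, index_sizes_gb, dma_sizes_gb):
    """Free space left once index data AND DMA summaries are both counted.

    Index size limits only constrain the first list; the DMA summaries
    are extra usage on the same storage unless they are stored elsewhere.
    """
    used_gb = sum(index_sizes_gb) + sum(dma_sizes_gb)
    return capacity_gb - used_gb


# Approximate figures from this thread (assumed, rounded for illustration).
headroom = disk_headroom_gb(
    capacity_gb=500,
    index_sizes_gb=[250],
    dma_sizes_gb=[250, 11],
)
# A negative value means the disk is over-committed.
print(f"Headroom: {headroom} GB")
```

With only the index sizes counted, the same disk would appear to have about half of its capacity free, which matches what the GUI index sizes suggested.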

---
If this reply helps you, Karma would be appreciated.


_pravin · ‎10-02-2023

Hi @gcusello and @richgalloway,

One final question to get clarity about data models.

Let's assume I have an index with a data retention time of 1 month and a data model acceleration summary range of 3 months. How will the data model behave in this case?

Will the data model keep accelerated data for the full 3 months, or will it drop the data once the index drops it?

Regards,

Pravin

gcusello · ‎10-03-2023

Hi @_pravin

If your raw data has a shorter retention than the data model, you can still search the data in the data model, but drilldown to the raw data is possible only for the shorter period.

Usually it's the other way around: the data model is searched over a period shorter than or equal to the retention of the raw data.

Ciao.

Giuseppe

richgalloway · ‎10-02-2023

Data models don't hold data past its retention period. To do that, use a summary index.
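Applied to the question above (1 month of index retention, 3 months of summary range), this reply amounts to taking the smaller of the two periods. A minimal sketch, assuming summaries never outlive the indexed data they were built from; the function name is hypothetical and is not part of Splunk:

```python
def searchable_summary_months(index_retention_months, summary_range_months):
    """Months of accelerated data available.

    Assumes, as stated in the reply above, that a data model summary
    does not hold data past the retention period of the indexed data.
    """
    return min(index_retention_months, summary_range_months)


# The scenario from the question: 1 month of retention, 3 months of summary range.
print(searchable_summary_months(1, 3))  # 1
```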

---
If this reply helps you, Karma would be appreciated.

gcusello · ‎10-02-2023

Hi @_pravin,

No, data models are calculated separately and, as @richgalloway said, they can be stored in a different location and have a different retention.

If your data models use as much space as the index, you have probably included _raw in the data model as well, which isn't a best practice: a data model should contain only the fields you need for your searches, not the entire _raw.

Usually, the disk space used by one year of an accelerated data model is around the daily license consumption for that index multiplied by 3.4.

Ciao.

Giuseppe

_pravin · ‎10-02-2023

Hi @gcusello ,

Our data models don't use the same space as the index, so the accelerated data isn't capped by a size limit.

I really liked your extended answer, but could you please explain the line quoted below? I find it a bit confusing.

"Usually, the disk space used by one year of an accelerated data model is around the daily license consumption for that index multiplied by 3.4."

Regards,

Pravin

gcusello · ‎10-02-2023

Hi @_pravin ,

The disk space used by accelerated data models is usually estimated with this formula:

disk_space = daily_used_license * 3.4

this formula is described in the Splunk Architecting training course.
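As a worked example of the formula, here is a small sketch. The 3.4 multiplier is the one quoted in this thread for one year of acceleration; the function names and the 10 GB/day figure are hypothetical, chosen only for illustration.

```python
SUMMARY_FACTOR = 3.4  # multiplier quoted in this thread for one year of acceleration


def estimated_dma_disk_gb(daily_license_gb, factor=SUMMARY_FACTOR):
    """Estimated disk space (GB) for an accelerated data model."""
    return daily_license_gb * factor


def implied_daily_license_gb(dma_disk_gb, factor=SUMMARY_FACTOR):
    """Inverse of the rule of thumb: daily ingest implied by a summary size."""
    return dma_disk_gb / factor


# Hypothetical index ingesting 10 GB/day.
print(round(estimated_dma_disk_gb(10), 1))      # 34.0
# The 250 GB summary reported in this thread, run backwards through the formula.
print(round(implied_daily_license_gb(250), 1))  # 73.5
```

Read backwards, a 250 GB summary would correspond to roughly 73.5 GB of daily ingest under this rule of thumb, which gives a quick way to check whether a summary is unexpectedly large.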

So it's very strange that you have 250 GB of indexes and 250 GB of data models.

This is possible only if you also included the _raw field in your data model, and this isn't a best practice, because a data model should contain only the fields required by your searches, not the entire _raw of every event.

Ciao.

Giuseppe

_pravin · ‎10-02-2023

Sorry, I meant to say that the sizes of the indexes (index1, index2, index3, and so on) all together add up to 250 GB. The data models, however, came to 250 GB for one of them, 11 GB for another, a few megabytes for the next one, and so on.

Actually, the data model has only the requested fields accelerated, but the summary range is 1 year. That obviously explains the growing size of the data models.

Thanks,

Pravin

