Hi Community,
We have a weird situation where one of our newest Splunk installs (about 3 months old) ran out of disk space; the server's capacity is 500 GB.
When I checked the size of each index in the GUI, they were all under their limits. The sum of them all was under 250 GB, which made sense as the maximum size of each index is set to the default 500 GB. But when I calculated the size of the data models associated with the indexes, I could see that the data models had used almost 250 GB.
My understanding was that the data models should also be counted against the index capacity, but they seemed to be exceeding the limits.
Can anyone please throw some light on this topic?
Regards,
Pravin
DMA data is stored in the same location (by default) as the index the accelerated data came from, but it is not included in the index size and so is not covered by index size limits. When sizing an index, leave room on the storage device for DMA, or use the tstatsHomePath setting in indexes.conf to put the DMA output elsewhere.
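For example, redirecting the DMA summaries to separate storage could look roughly like this in indexes.conf (a sketch only; the volume path, size, and index name are hypothetical and should be adapted to your environment):

    # indexes.conf
    [volume:dma_summaries]
    path = /opt/splunk_dma_storage
    maxVolumeDataSizeMB = 250000

    [my_index]
    homePath   = $SPLUNK_DB/my_index/db
    coldPath   = $SPLUNK_DB/my_index/colddb
    thawedPath = $SPLUNK_DB/my_index/thaweddb
    # Send data model acceleration summaries to the separate volume
    # instead of the default _splunk_summaries volume.
    tstatsHomePath = volume:dma_summaries/my_index/datamodel_summary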
Hi @gcusello and @richgalloway,
One final question to get clarity about data models.
Let's assume I have an index with a data retention period of 1 month and a data model acceleration summary range of 3 months. How will the data model behave in this case?
Will the data model keep accelerated data going back 3 months, or will it drop the data once the index drops it?
Regards,
Pravin
Hi @_pravin,
if your raw data has a shorter retention than the data model summary, you can still search the data model, but a drilldown to the raw events is only possible for that shorter period.
Usually it's the other way around: the data model is searched over a period shorter than or equal to the raw data's retention.
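For reference, the two settings that drive this behaviour would look roughly like this (a sketch; the data model and index names are hypothetical):

    # datamodels.conf - summary range of 3 months
    [My_DataModel]
    acceleration = true
    acceleration.earliest_time = -3mon

    # indexes.conf - raw data retention of roughly 1 month (30 days)
    [my_index]
    frozenTimePeriodInSecs = 2592000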
Ciao.
Giuseppe
Data models don't hold data past its retention period. To do that, use a summary index.
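A minimal sketch of the summary-index approach, assuming a scheduled search and a pre-created summary index named summary_myindex (both names are hypothetical):

    index=my_index sourcetype=my_sourcetype
    | stats count by host, source
    | collect index=summary_myindex

Because the summarized results live in their own index, they follow that index's retention settings rather than the source index's, so they can be kept for 3 months or longer even after the raw events have aged out.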
Hi @_pravin,
no, Data Models are calculated separately and, as @richgalloway said, they can be stored in a different location and have a different retention.
If your Data Models take up as much space as the index, you probably included _raw in the Data Model as well, and that isn't a best practice: a Data Model should contain only the fields you need for your searches, not the entire _raw.
Usually the disk space for one year of an accelerated Data Model is around the daily license consumption for that index multiplied by 3.4.
Ciao.
Giuseppe
Hi @gcusello,
Our data models don't occupy the index's allocated space, so the accelerated data isn't capped by the index size limit.
I really liked your detailed answer, but could you please explain the line below in quotes? I find it a bit confusing.
"Usually the disk space for one year of an accelerated Data Model is around the daily license consumption for that index multiplied by 3.4."
Regards,
Pravin
Hi @_pravin,
the disk space used for one year of an accelerated Data Model is usually estimated with this formula:
disk_space = daily_license_usage * 3.4
this formula is described in the Splunk Architecting training course.
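For example, if that index consumes roughly 10 GB of license per day, you would expect about 10 GB * 3.4 = 34 GB of disk for one year of accelerated Data Model summaries (a rough estimate, not a hard limit).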
So it's very strange that you have 250 GB of indexes and 250 GB of Data Models.
This is possible only if you also included the _raw field in your Data Model, and that isn't a best practice because a Data Model should contain only the fields required by your searches, not the full _raw of every event.
Ciao.
Giuseppe
Sorry, I meant to say that the sizes of the indexes (index1, index2, index3, and so on) all together sum up to 250 GB. But with the data models, the sizes were 250 GB for one of them, 11 GB for another, a few megabytes for the next one, and so on.
Actually, the data model has only the required fields accelerated, but the summary range is 1 year. That obviously explains the growing size of the data models.
Thanks,
Pravin