I have some questions regarding data trimming.
From which version has data trimming been available?
What is the parameter that controls trimming, i.e. how much storage has to be filled before the data is trimmed?
Can we stop data trimming? And how can we know that data is about to be trimmed?
I am talking about the case where you have given a particular size and a retention period to your index. If the data overloads your storage size, then Splunk starts trimming old data from the cold buckets so that you have storage for your new data.
Hope this explanation helps.
Hi @Siddharthnegi,
yes, I completely misunderstood your question!
Anyway, when a bucket exceeds the retention time and you haven't configured a path (coldToFrozenDir) to save it offline, it is discarded, but only when the latest event in the bucket exceeds the retention period.
For this reason, you can have events that exceed the retention period, because they sit in the same bucket as events that do not yet exceed it.
For this reason, it's a best practice to store in the same index only events with roughly the same ingestion frequency.
Buckets are also discarded when your index reaches its max size: when this occurs, the oldest buckets are discarded one by one until the index is again below the max size.
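As a minimal indexes.conf sketch of the archiving option mentioned above (the index name and paths are hypothetical; coldToFrozenDir is the parameter that saves frozen buckets instead of deleting them):

[my_index]
homePath = $SPLUNK_DB/my_index/db
coldPath = $SPLUNK_DB/my_index/colddb
thawedPath = $SPLUNK_DB/my_index/thaweddb
# without coldToFrozenDir, buckets past retention are simply deleted
frozenTimePeriodInSecs = 7776000
# with it, frozen buckets are archived here instead of being deleted
coldToFrozenDir = /opt/splunk_archive/my_index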
Ciao.
Giuseppe
I have some questions regarding data trimming,
like from which version of Splunk this feature was added.
Hi @Siddharthnegi,
I have worked on Splunk since version 4 and it has always been present; I cannot answer for earlier versions.
Ciao.
Giuseppe
OK, and when does it trim the data? Like, how much storage has to be filled in order for Splunk to trim old data?
Are there any parameters for trimming?
Hi @Siddharthnegi ,
you can define:
frozenTimePeriodInSecs, the retention period in seconds after which a bucket is frozen (deleted, by default), and
maxTotalDataSizeMB, the maximum total size of the index, beyond which the oldest buckets are frozen.
Here you can find useful information: https://www.splunk.com/en_us/blog/tips-and-tricks/managing-index-sizes-in-splunk.html?locale=en_us
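As an illustration, a minimal indexes.conf sketch of those two parameters (the index name and values are hypothetical):

[my_index]
# size-based trimming: freeze the oldest bucket when the index exceeds 500 GB
maxTotalDataSizeMB = 512000
# time-based trimming: freeze buckets whose newest event is older than 90 days
frozenTimePeriodInSecs = 7776000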
Let me know if you need more help, otherwise, please accept the answer for the other people of Community.
Ciao.
Giuseppe
P.S.: Karma Points are appreciated 😉
I am asking about when Splunk trims the old data from the cold buckets. When does it do that? Like, how much data has to be filled in the index for Splunk to trim it?
Say I have an index to which 500 GB is allocated. When Splunk trims the data, how much of the 500 GB should be filled in order for Splunk to do that?
Hi @Siddharthnegi,
as I said, size-based trimming occurs when the index reaches its max size, and the oldest bucket is deleted.
In this way the index shrinks, then continues to grow until it again reaches the max size, at which point it deletes the oldest bucket again, and so on.
With maxDataSize = auto_high_volume (a common setting for high-volume indexes), a bucket can reach 10 GB, so that is roughly the trimmed size; note that the plain default (auto) is 750 MB.
About the question of when: trimming is performed when the max size is reached.
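If you want to verify the real bucket sizes in your environment, a hedged sketch using the dbinspect command (my_index is a placeholder):

| dbinspect index=my_index
| stats count avg(sizeOnDiskMB) AS avg_bucket_mb max(sizeOnDiskMB) AS max_bucket_mb BY state

The trimmed amount depends on the actual size of the oldest bucket, so this shows what to expect for your own indexes.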
Ciao.
Giuseppe
So if only 10 GB of storage is remaining in the index, then Splunk starts trimming old data?
Hi @Siddharthnegi,
no, trimming starts when the index reaches its max size.
After trimming, the index will probably be around 490 GB, so it continues to grow until it reaches the max size again, and then the trimming process restarts.
ciao.
Giuseppe
So it trims 10 GB of data every time the storage is filled?
Hi @Siddharthnegi,
yes: Splunk trims the oldest bucket, which usually (with maxDataSize = auto_high_volume) has a size of around 10 GB.
Ciao.
Giuseppe
Thanks for your answer. However, we are facing an issue where there is enough space in our index, but our disk usage has reached around 80%. So I just want to know if volume trimming happens at the disk level as well? Attached below are our index configuration for the Palo Alto index and the disk status.
[firewall_paloalto]
coldPath = volume:cold\firewall_paloalto\colddb
homePath = volume:hotwarm\firewall_paloalto\db
thawedPath = D:\splunk_data\firewall_paloalto\thaweddb
tstatsHomePath = volume:hotwarm\firewall_paloalto\datamodel_summary
frozenTimePeriodInSecs = 47304000
maxTotalDataSizeMB = 4294967295
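One hedged note on the volume question: Splunk also trims at the volume level, but only if the volume stanza defines maxVolumeDataSizeMB; when the volume exceeds that size, the oldest buckets in it are frozen. A hypothetical sketch (the path and size are assumptions, not taken from the configuration above):

[volume:cold]
path = D:\splunk_data\cold
# freeze the oldest buckets in this volume when it grows past ~1 TB
maxVolumeDataSizeMB = 1048576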
Can we increase this 10 GB margin for data trimming? And can we know in advance that Splunk is about to trim the data, so that we would know the data is going to be trimmed?
Hi @Siddharthnegi,
I don't like to change this kind of parameter, because you only move the problem, you don't solve it: even with a max bucket size of 20 GB, trimming frees 20 GB of disk space; what has changed?
In other words, what's the problem?
Splunk automatically trims the oldest bucket when the index reaches its max size.
In my opinion, the most important aspect to analyze is:
is the max-index-size trimming approach compatible with your retention policy?
In other words, if you need to retain events for 90 days, could events within this period be trimmed? (This must be checked.)
Because there's a risk of trimming events that are still within the retention period.
I usually don't use the max-size approach for trimming, but only the retention period, to avoid trimming events that I still need.
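As one hedged way to know in advance, you could monitor the index fill level against maxTotalDataSizeMB with a REST search and save it as an alert (the 80% threshold is just an example):

| rest /services/data/indexes splunk_server=local
| eval pct_full = round(currentDBSizeMB / maxTotalDataSizeMB * 100, 1)
| where pct_full > 80
| table title currentDBSizeMB maxTotalDataSizeMB pct_full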
Ciao.
Giuseppe
OK, I understood,
but can we still change it?
Also, how do we prevent it, or know before it happens?
Hi @Siddharthnegi,
as I said, you have to design your storage in advance, defining for each index the maximum storage required. So,
if for an index you are ingesting 70 GB/day and you want a retention of 90 days, then assuming roughly 50% compression on disk you need
70 * 0.5 * 90 = 3150 GB available,
and adding a margin of 10%, you need around 3.5 TB of disk space.
Making the same calculation for each index gives you your storage requirements.
You could also look at the average license consumption and use that value for the calculation.
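Translating that sizing into configuration, a hypothetical stanza for the 70 GB/day example (required path settings omitted for brevity):

[my_index]
# 70 GB/day * 0.5 compression * 90 days = 3150 GB = 3225600 MB
maxTotalDataSizeMB = 3225600
# 90 days of retention, in seconds
frozenTimePeriodInSecs = 7776000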
Ciao.
Giuseppe
Hi @Siddharthnegi,
are you speaking of data truncating, to limit the length of overly long events, or of filtering and deleting whole events?
If data truncating, you can use TRUNCATE = 1000 (the default is 10,000) in your props.conf (for more info see https://docs.splunk.com/Documentation/Splunk/9.1.0/Admin/Propsconf); to my knowledge it has been in Splunk since the first releases.
If you're speaking of event filtering, see https://docs.splunk.com/Documentation/Splunk/9.1.0/Forwarding/Routeandfilterdatad
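A minimal props.conf sketch of the truncating option (the sourcetype name is hypothetical):

[my_sourcetype]
# cut events longer than 1000 characters (default is 10000)
TRUNCATE = 1000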
Ciao.
Giuseppe