I understand from Splunk's flat-file, index-based architecture that if storage space is insufficient, Splunk cannot store and index incoming logs, and instead reports a low-disk-space error.
To address this, we would like to schedule a cleanup action for the stored log messages.
Please let us know if there is a way to schedule a cleanup action in Splunk to avoid running out of storage.
That isn't the Splunk way. What you need to do is constrain your indexes in indexes.conf by size (maxTotalDataSizeMB per index, or maxVolumeDataSizeMB per volume), not only by date, although you can do that too, so that an index cannot grow beyond what you have predefined:
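As a sketch, a per-index size cap might look like the following (the index name and the 50 GB figure are illustrative, not values from this thread):

```ini
# indexes.conf -- illustrative values only
[my_index]
homePath   = $SPLUNK_DB/my_index/db
coldPath   = $SPLUNK_DB/my_index/colddb
thawedPath = $SPLUNK_DB/my_index/thaweddb
# Cap the total size of this index at ~50 GB; once the limit is
# reached, Splunk freezes (deletes or archives) the oldest buckets.
maxTotalDataSizeMB = 51200
```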
When you configure your Splunk indexes, a setting called frozenTimePeriodInSecs controls the data retention of each index. If you have frozen storage defined, the data can age out to another location; if not, it is simply removed from Splunk. Note that because Splunk stores indexed data in chunks called buckets and rolls data at the bucket level, aging is based on the newest event in each bucket, so the effective retention period may not look exact, depending on data velocity and bucket sizes. By setting a retention period this way you let Splunk manage removing data as it ages, rather than relying on a manual cleanup.
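For example, a 90-day retention could be sketched like this (the index name and period are examples, not values from this thread):

```ini
# indexes.conf -- illustrative retention setting
[my_index]
# Buckets whose newest event is older than 90 days (7,776,000 s)
# are frozen: deleted, or moved to coldToFrozenDir if one is set.
frozenTimePeriodInSecs = 7776000
# Optional: archive frozen buckets instead of deleting them.
# coldToFrozenDir = /archive/splunk/my_index
```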
Another thing you can do to keep the disk from filling up, if the data grows faster than your retention period removes it, is to set the parameters homePath.maxDataSizeMB and coldPath.maxDataSizeMB in indexes.conf; when Splunk hits these limits it ages out the oldest data, which lets you ensure a set amount of disk space stays free.
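A sketch of those per-path limits (again, the index name and sizes are illustrative):

```ini
# indexes.conf -- illustrative per-path size limits
[my_index]
# When homePath (hot/warm buckets) exceeds ~20 GB, the oldest
# warm buckets roll to cold; when coldPath exceeds ~30 GB, the
# oldest cold buckets roll to frozen, freeing disk space.
homePath.maxDataSizeMB = 20480
coldPath.maxDataSizeMB = 30720
```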
Finally, are you collecting the filesystem capacity data from your Splunk infrastructure? You can also leverage that data and combine it with the predict command in a nice search to project when a particular indexer might run out of disk space based on its average growth rate so you have time to respond before the disk actually fills up.
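As a rough sketch, assuming the platform introspection data is available (the _introspection index with component=Partitions is what the monitoring console uses for disk data; the host value and the exact field names here should be verified against your deployment):

```
index=_introspection component=Partitions host=my_indexer
| timechart span=1h avg("data.free") AS free_mb
| predict free_mb future_timespan=168
```

This charts average free space per hour and projects it one week (168 hours) forward, so you can alert well before the disk actually fills.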
We are using Splunk Enterprise as a service on a cloud platform.
Can you please let me know how to access the indexes.conf file? Do you mean we need access to the Splunk installation directory structure to edit this conf file?
If you're using Splunk Cloud, I think they generally manage settings that aren't configurable via the GUI. That said, I haven't heard of customers having to worry about disk utilization on their Splunk Cloud indexers. You might want to engage support to inquire about those settings, or to ask whether you even need to worry about infrastructure disk utilization when using their managed service.