We have a 12-node Hadoop cluster and we use Splunk to index all of its log files (HBase, Cloudera Manager, JobTracker, NameNode, and assorted other logs). The NAS device where we store all the Splunk indexes is now 100% full (df -h shows 100%).
How can we limit the disk space used by Splunk indexing to 50 GB? We have tried 'clean eventdata', but the disk filled up again within a week.
You can set a maximum size for each Splunk index. When an index reaches that size, the oldest data is automatically rolled out (aka "frozen"), which by default means it is deleted.
Go to the Splunk Manager and look at the maximum size set for each index. Set these values so that the combined maximum across all indexes stays below the disk space you want Splunk to use (50 GB in your case).
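The same limit can also be set directly in indexes.conf with the maxTotalDataSizeMB setting. Here is a minimal sketch, assuming the Hadoop logs go to the default main index and that you edit the local copy of the file. The sizes are illustrative values chosen to stay under 50 GB, not recommendations:

```ini
# $SPLUNK_HOME/etc/system/local/indexes.conf
# Assumed split of a 50 GB budget; adjust index names and sizes to your setup.

[main]
# Cap this index at roughly 40 GB; the oldest buckets are frozen once it grows past that.
maxTotalDataSizeMB = 40000

[_internal]
# Splunk's own logs take up space as well, so cap them too.
maxTotalDataSizeMB = 5000
```

Splunk typically needs a restart to pick up changes to indexes.conf. Other indexes (for example _audit) use the same disk, so leave some headroom rather than allocating the full 50 GB.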
There is also a feature in Splunk that lets you define volumes, each with its own size cap, for more detailed control of the disk space shared by several indexes.
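As a rough sketch of the volume approach, the fragment below caps everything stored on the NAS at 50 GB in total. The volume name nas, the mount point /mnt/nas/splunk, and the index name hadoop_logs are hypothetical placeholders, so substitute your own:

```ini
# $SPLUNK_HOME/etc/system/local/indexes.conf
# The volume name, path, and index name below are assumptions for illustration.

[volume:nas]
path = /mnt/nas/splunk
# Combined cap for all index data stored on this volume (about 50 GB).
maxVolumeDataSizeMB = 50000

[hadoop_logs]
# Hot/warm and cold buckets live on the capped volume.
homePath = volume:nas/hadoop_logs/db
coldPath = volume:nas/hadoop_logs/colddb
# thawedPath cannot reference a volume, so it uses a regular path.
thawedPath = $SPLUNK_DB/hadoop_logs/thaweddb
```

With this layout, once the volume reaches its cap Splunk freezes the oldest buckets among the indexes that reference the volume, regardless of which index they belong to.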
Read more in the Splunk documentation under "Configure index size".