My Splunk installation has indexed some files that weren't supposed to be indexed (dot files created by rsync), and now I'm seeing a Pool quota overage alert in Manager > Licensing > Licensing alerts.
The message states "please correct before midnight", but doesn't tell me how.
I can search for the unwanted events by filtering on the source field, and I could pipe the results to the Delete operator, but AFAIK that has zero effect on the licensing.
So what exactly is Splunk encouraging me to do before midnight?
Splunk is warning you that if the situation isn't corrected, you may run into a license violation, and the overage may carry over into the next day as well. The proper course of action is to determine where the additional data is coming from and either disable those inputs or set up null queue routing to keep the data from being indexed. Instructions for routing unwanted or unnecessary data to the null queue can be found here:
http://docs.splunk.com/Documentation/Splunk/latest/Deploy/Routeandfilterdatad
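As a rough sketch of what null queue routing can look like for the rsync dot files from the question: the sourcetype name `rsync_files`, the transform name `drop_rsync_dotfiles`, and the regex are all assumptions made for illustration, so adjust them to your own inputs. Note that this only affects data that has not been indexed yet.

```ini
# props.conf
# "rsync_files" is a hypothetical sourcetype name; use the one assigned to the affected input.
[rsync_files]
TRANSFORMS-drop_dotfiles = drop_rsync_dotfiles

# transforms.conf
[drop_rsync_dotfiles]
# Match on the event's source path rather than on the raw event text.
SOURCE_KEY = MetaData:Source
# Illustrative pattern: any source whose file name starts with a dot.
REGEX = /\.[^/]+$
# Send matching events to the null queue so they are discarded before indexing.
DEST_KEY = queue
FORMAT = nullQueue
```

These settings need to live on the instance that parses the data (an indexer or heavy forwarder), and a restart is typically required before they take effect.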
You can find some useful searches here:
http://www.splunk.com/wiki/Community:TroubleshootingIndexedDataVolume
Removing the indexed data in a surgical way isn't possible. Data is aged out via retention policies that are based on age and size. Even if you did age the data out, Splunk would still know the volume of data recorded for the day, and you'd still run into a violation.
You can "correct" the overage and avoid a license violation only if you have other pools from which you can move spare volume to the pool that raised the warning, or if you have an additional license you haven't used yet that you can add to the pool. If the pool's total license volume is then larger than the day's indexed volume, you avoid a license violation at midnight.
Many users put all of their licenses in a single pool. In that case, unfortunately, there is no action you can take in response to the "correct before midnight" warning.
Thanks for the details on how to actually "correct" the issue (= there is nothing you can do if you do not have additional licenses).
I downvoted this post because it doesn't actually answer the question. The correct answer is posted below.
You can delete records you don't want, but it doesn't recover your quota, and it doesn't recover disk space.
I always send a new data source to a "development" index first. That way, if I make a mistake, I can wipe the index without losing all of my other data.
So I just noticed that we incorrectly added the security log which I don't need in Splunk and that was millions of entries that I don't want to be indexed. This means that even though I removed that data input, I can't remove it from the index? We just installed it yesterday and I really don't want the security event logs in Splunk.
Yes, but often you find that you were indexing data you didn't really care about anyway. There is no way to 'un-index' data.
Thanks.
So, just to confirm, if my data has already been indexed (and I have already exceeded my indexing quota), there is nothing that I can do to "un-index" that data?
Wouldn't the nullQueue routing just avoid indexing future data?
If you've identified and stopped the flow of additional data, you've corrected the situation. That doesn't change the fact that you went over what was allowed for a particular queue.
This response doesn't provide an answer to the actual question of how to "correct before midnight".
I understand that Splunk is warning me because I have exceeded my indexing quota, but I would like to understand if there is a way for me to remove some of the indexed data so that I can stay below my daily quota. I have already identified where the additional data came from and I have stopped the flow of additional data.
How do I correct before midnight?