We currently have two different Splunk environments - one for Production data and one for UAT data. Both sets of data are of significant use to us.
I would like to consolidate these two environments into one (albeit across four indexers, as we were already planning a major re-architecture of our Splunk system) to centralise the configurations, along with many other benefits, and I can partition the different data via indexes instead of servers.
There is one problem with this: our UAT environment is subject to unannounced load tests where, on certain days, the log volume will increase up to 15-fold. Consolidating onto a single system means that I will lose the ability to set a license limit (via license pooling) on the UAT servers only; as far as I can tell, there is no way to limit license usage to specific indexes. It is definitely possible that this could happen more than 5 times in a 30-day period, which could impact our entire Splunk system by causing max license violations and blocking search on all data, including production.
What are my options for avoiding this potential issue? Is there a way I can partition or isolate this problem/high volume data source within the system?
FYI, we have a multi-hundred GB Splunk license for our organisation.
If the goal is to prevent data from being indexed in order to avoid license violations, a very manual solution is:
There are probably other methods using the license pools.
Enhancement request 81027 made.
Please do make an enhancement request for per-index license limitations, and describe your use case.
Update:
Details are on this wiki page: http://wiki.splunk.com/Community:TroubleshootingIndexedDataVolume
Remark:
license_usage.log is available on the Splunk license master instance only. The license master logs the indexed event volume every minute, based on the information the slaves send to it. Each slave maintains a table of how much it has indexed, in chunks of time. A chunk is typically 1 minute, but it may grow if the slave cannot contact the master, because Splunk only resets the chunk when the table is sent to the master. The table is keyed by (source, sourcetype, host) tuples. If it grows beyond 1000 entries, Splunk squashes the host and source keys, so with more than 1000 distinct tuples you will find no value in the h (host) and s (source) fields. Splunk never suppresses the st (sourcetype) field in the log.
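To make the squashing behaviour in the remark concrete, here is a rough Python sketch of the idea. It is not Splunk's actual implementation: the function names `record` and `squash_if_needed`, the sample sources, sourcetypes and hosts, and the choice to blank host and source for every row once the limit is exceeded are all assumptions made for illustration. Only the 1000-entry threshold and the fact that sourcetype is always kept come from the description above.

```python
from collections import defaultdict

MAX_TUPLES = 1000  # threshold quoted in the remark above


def record(table, source, sourcetype, host, nbytes):
    """Add nbytes to the (source, sourcetype, host) bucket of the current chunk."""
    table[(source, sourcetype, host)] += nbytes


def squash_if_needed(table, max_tuples=MAX_TUPLES):
    """Return the rows to report for this chunk.

    If the table holds more than max_tuples distinct tuples, host and
    source are blanked and volumes are summed per sourcetype only
    (assumed behaviour, for illustration).
    """
    if len(table) <= max_tuples:
        return dict(table)
    squashed = defaultdict(int)
    for (_source, sourcetype, _host), nbytes in table.items():
        squashed[("", sourcetype, "")] += nbytes
    return dict(squashed)


# Hypothetical load test: 1500 distinct UAT sources plus one production source.
table = defaultdict(int)
for n in range(1500):
    record(table, f"/var/log/app{n}.log", "uat_app", f"uat-host-{n % 50}", 100)
record(table, "/var/log/prod.log", "prod_app", "prod-host-1", 2500)

report = squash_if_needed(table)
for (source, sourcetype, host), nbytes in sorted(report.items()):
    print(f"s={source!r} st={sourcetype!r} h={host!r} bytes={nbytes}")
```

The practical consequence is that during a high-cardinality burst you can still attribute volume by sourcetype, but not by host or source, so giving the UAT data its own sourcetypes keeps it identifiable in the license usage data.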
That is a novel idea actually. It does unfortunately have a bit of manual intervention, but definitely worth a look. Thanks for taking the time to think about it. Hopefully there will be some other suggestions, but if not I'll mark this as answered in a few days.