Currently evaluating Splunk with a view to buying a license, but for now we're on the unlicensed 500 MB tier.
I installed a Universal Forwarder on both our DCs, and it has basically taken us right up to our 500 MB daily quota because it appears to have forwarded all the historic event logs on the DCs as well as everything "from now onwards".
I'm not yet knowledgeable enough about Splunk to know the correct way to "clean" the old data - and presumably, even if I did, that wouldn't change the fact that it has already ingested and indexed 500 MB today, so I'm still over the quota.
It would be good to know for the next time I add a Windows server though.
The free Splunk license allows for 3 days of overage in any rolling 30-day period (see here). That small amount of "forgiveness" for the occasional violation is very useful for exactly what you describe: turning on a new input when either a) you didn't really know how much data you were going to ingest, or b) it grabbed all the history that first time.
Keep an eye on the license tomorrow; if it looks like it will stay under 500 MB/day, you should be OK. Two DCs in a smallish environment may well come in under 500 MB/day, though obviously every environment and system is different. If you search Answers, you can find all sorts of useful license searches, and IIRC there's still an app that pulls a bunch of useful dashboards together. I've found the built-in license reports enough for my needs, though, especially now with some of the new Distributed Management Console features.
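As a quick sanity check, a common search (a sketch, assuming your instance is collecting the default _internal index, which it does out of the box) charts daily license usage from license_usage.log:

```
index=_internal source=*license_usage.log type="Usage"
| timechart span=1d sum(b) AS bytes
| eval MB = round(bytes/1024/1024, 2)
```

If the MB column stays under 500 after the initial historical load, you're in the clear.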
You might, depending on what you are using the data for, be able to turn on renderXml (search for it in this document), which can reduce ingest by more than half. Otherwise, your only other option is to blacklist some events or otherwise limit what you are ingesting: this link may get you started.
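For example, a Security event log input with XML rendering enabled would look something like this in inputs.conf on the forwarder (the Security channel is just an example; use whichever channels you are collecting, and test whether XML actually shrinks your events):

```
[WinEventLog://Security]
disabled = 0
renderXml = true
```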
You might be able to use the ignoreOlderThan setting in inputs.conf, which sets how recently a file must have been modified for Splunk to pick it up; anything older than the threshold is ignored altogether. The default of 0 disables the check, which is what caused your current issue of Splunk collecting and indexing all the historical data you pointed it at.
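Note that ignoreOlderThan applies to file monitor inputs rather than the Windows event log inputs, so it's relevant if you also monitor log files on the server. A sketch (the path and 7-day threshold are just placeholders for illustration):

```
[monitor://D:\Logs\myapp\*.log]
disabled = 0
ignoreOlderThan = 7d
```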
Another option, if you don't mind losing data generated during downtime, is to set current_only to 1 (not generally recommended), which makes Splunk collect only events generated while the forwarder is running.
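For the Windows event log inputs specifically, that looks like this (again using the Security channel as an example):

```
[WinEventLog://Security]
disabled = 0
current_only = 1
```

The trade-off is exactly as stated: any events written while the forwarder is stopped are never collected.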
I think you may have to rebuild the indexes.
If you are monitoring files and indexing them, the index grows every day: the old data stays and the new data is added on top. In other words, if you have a 42 MB index and you keep indexing data every day, you end up with those 42 MB plus whatever arrives day by day.
I had the same problem, and now that I have a license I keep doing the same thing I did while I was waiting for it.
I cleaned all the indexes and created new ones. For example, for alerts, I deleted the one I had and created a new index for each year (2015, 2014, ...). The past years I indexed as static data, using the "Index once" option when adding the data; for the current year I have one index that is being updated every day.
This way I keep the old data, but it only consumed license quota on the one day I indexed it, and while the current year's index keeps growing, the volume it consumes per day is lower than the others.
Be careful anyway not to index all the historical data in a single day; pay attention to the limit. But it worked for me 🙂
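If you do go the route of wiping and re-creating an index as described above, the usual way is the splunk clean CLI on the indexer (the index name here is a hypothetical example; note this permanently deletes the indexed data, and the -f flag skips the confirmation prompt):

```
splunk stop
splunk clean eventdata -index alerts_2014 -f
splunk start
```

Cleaning removes the indexed events but does not give back any license quota already consumed that day.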