Is there any recommended mechanism to bring a new server, or new "source" online, where there may be historic data, thus avoiding a temporary "breach" of licensing?
For example, say I bring in a 2-year-old server with data that's never been indexed, and it's been logging 50MB per day.
50MB x 365 days x 2 years = 36,500MB, or roughly 36GB.
Going forward, it would only ever log 50MB per day, so that shouldn't be an issue, but indexing the initial backlog would put me well over, albeit temporarily.
Likewise, if a "noisy" forwarder was offline or down for a while, is there any way to bring it back online without going over?
Basically, just bring it back online, exceed your license for the day, and move on. You can exceed a free license 3 times in 30 days without losing your ability to search, and you can do the same 5 times on an enterprise license. So the only real recommendation is to start the process at 12:01 AM server time, to make the best use of your license exceptions (or at least, don't start it at 11 PM).
Check this out for more: http://splunk-base.splunk.com/answers/322/what-happens-when-i-exceed-my-licensed-limit
You can also limit the rate at which the forwarder sends data, using the maxKBps setting in the [thruput] stanza of limits.conf. From the spec file:
* If specified and not zero, this limits the speed through the thruput processor to the specified rate in kilobytes per second.
* To control the CPU load while indexing, use this to throttle the number of events this indexer processes to the rate (in KBps) you specify.
While it specifically mentions CPU load, I've seen several people using this to limit data for just this purpose.
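As a rough sketch of what that might look like on the forwarder (the value 5 is purely illustrative, not a recommendation, and the file location assumes a local override such as $SPLUNK_HOME/etc/system/local/limits.conf):

```ini
# limits.conf on the forwarder (assumed location: $SPLUNK_HOME/etc/system/local/)
[thruput]
# Illustrative value only: 5 KB/s x 86,400 s/day = 432,000 KB, i.e. about 422MB/day at most.
maxKBps = 5
```

At that rate the 36,500MB backlog from the question would take nearly three months to trickle in, so pick a value that fits your own daily license headroom. The forwarder will likely need a restart for the change to take effect, and the limit applies to everything passing through it, so remember to raise or remove it once the backlog has cleared.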