Forgive my hasty question, it's late and my articulation has dwindled along with my brain capacity...
We need a solution that allows us to not lose any events whatsoever on a collector, AKA heavy forwarder.
We can't do persistent queues across all inputs, as splunktcp-ssl inputs do not support them. We want all inputs to go straight into Splunk, no syslog relays etc.
The index and forward function won't work for us as it counts toward licence usage. How would one check to see what events were not acknowledged anyway? I assume something would have to be hacked into place to check which events were received?
I thought of writing something that monitors the uplink and, when it drops, stops splunkd, copies over another outputs.conf (with no forward servers configured), and starts Splunk so it indexes locally; then it repeats the process in reverse when the uplink is back.
I have noticed if all forward servers are removed from outputs.conf, either at start or via CLI one at a time then Splunk automatically starts to index locally on the fly.
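That on-the-fly behaviour can be driven from the CLI without restarting splunkd. A hypothetical failover check could wrap commands like these (host, port, and credentials are placeholders; repeat per receiver):

```
# Uplink down: stop forwarding to this receiver so events index locally.
splunk remove forward-server idx1.example.com:9997 -auth admin:changeme

# Uplink restored: resume forwarding.
splunk add forward-server idx1.example.com:9997 -auth admin:changeme
```

This is only a sketch of the mechanism you already observed; as you note below, re-adding servers one at a time has side effects that need handling.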
This is ideal as it happens on the fly and no event loss I presume? This seems to be the closest solution I could find except that adding forward servers one at a time caused our data to be cloned in triplicate. Ouch!
We don't want to do cloning, hell no, we assume one uplink in each scenario.
We have three receivers (indexers) on the remote side but only one uplink.
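For what it's worth, the triplicate cloning usually comes down to how outputs.conf is laid out: listing several target groups in defaultGroup clones events to every group, while listing all three receivers inside one target group load-balances instead. A minimal sketch (hostnames are placeholders):

```
# outputs.conf -- ONE target group, three servers = load balancing, no cloning
[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
server = idx1.example.com:9997, idx2.example.com:9997, idx3.example.com:9997

# By contrast, "defaultGroup = groupA, groupB, groupC" with one server per
# group would clone every event to all three -- the triplicate behaviour.
```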
I'm lost, how can we get Splunk to index locally ONLY when the uplink is unavailable and hence the event is not ack'ed and then merge those buckets/events out of band at a later stage?
It would be perfect if we could put all not ack'ed events into an index somehow on the localhost after a timeout and then when back online to forward those same events, get them ack'ed and clean out the local index.
This part I know how to do: force a roll of the last hot bucket, then scrub the IDs and scp the warm buckets upstream onto an indexer, merge, and restart.
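For the roll-and-ship step, a very rough sketch, assuming a local holding index named "failover_buffer" and placeholder paths/credentials (the REST endpoint takes care of rolling hot to warm; the bucket ID scrubbing on the receiving side is still manual):

```
# Force the hot bucket(s) of the holding index to roll to warm.
splunk _internal call /data/indexes/failover_buffer/roll-hot-buckets -auth admin:changeme

# Ship the rolled warm buckets to an indexer; bucket IDs may need renumbering
# there to avoid collisions before the merge and restart.
scp -r $SPLUNK_HOME/var/lib/splunk/failover_buffer/db/db_* \
    idx1.example.com:/opt/splunk/var/lib/splunk/failover_buffer/db/
```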
If the heavy forwarder has access to a semi-persistent data source (log files) then it does this out of the box.
If the data source is something else (like a UDP input) then I encourage you to render it as log files, e.g. via rsyslogd.
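A minimal sketch of that approach, assuming rsyslog's legacy directive syntax and example paths (one directory per sending host so Splunk can derive the host field):

```
# /etc/rsyslog.conf -- accept UDP syslog and persist it to disk
$ModLoad imudp
$UDPServerRun 514

# Write one directory per sending host (path is an example)
$template RemoteFile,"/var/log/remote/%HOSTNAME%/syslog.log"
*.* ?RemoteFile
```

Splunk then tails the files with an ordinary monitor input, which survives outages because the data is on disk:

```
# inputs.conf on the heavy forwarder
[monitor:///var/log/remote]
sourcetype = syslog
# 4th path segment (/var/log/remote/<host>/...) becomes the host field
host_segment = 4
```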
We could do that but what about the inputs that do not support persistent queues? Like splunktcp-ssl? Because of our high security client sites, we don't have any access to forwarders on hosts to change anything, like Windows event log buffers etc. We get to install it once and then hope for the best.
Yeah, I don't see a problem with that solution. It will work for everything except streaming data such as perfmon, SNMP, UDP data, etc. Data resilience requires additional hardware (physical or virtual).
Use indexer acknowledgement between the forwarder and the indexers, then increase the client output queue (this will use more RAM). Once the queue is full the forwarder will stop accepting new events, and it will pick up where it left off once the queue has been processed.
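As a sketch, both knobs live in outputs.conf on the forwarder (hostnames and the queue size are placeholders to tune for your RAM budget):

```
# outputs.conf on the heavy forwarder
[tcpout]
defaultGroup = indexers

[tcpout:indexers]
server = idx1.example.com:9997, idx2.example.com:9997, idx3.example.com:9997
# Indexer acknowledgement: events stay queued until the indexer confirms them.
useACK = true
# Larger in-memory output queue; events wait here while the uplink is down.
maxQueueSize = 512MB
```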
Why not use multiple intermediate HFs with load balancing, coupled with indexer acknowledgement? The source forwarder, the intermediates, and the indexers would all have to acknowledge events before they are removed from the queues. You would then increase your input and output queue sizes to an acceptable size so the HF can buffer before it stops processing new events. You could also increase your source forwarder's output queue. Items will stay in these queues until the indexers are functioning and acknowledging events. I can elaborate if needed.
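A sketch of that topology, with placeholder hostnames and sizes — acknowledgement is enabled at every hop so nothing leaves a queue until the next tier confirms it:

```
# Source forwarder outputs.conf -- load balance across two intermediate HFs
[tcpout]
defaultGroup = intermediate_hfs

[tcpout:intermediate_hfs]
server = hf1.example.com:9997, hf2.example.com:9997
useACK = true
maxQueueSize = 128MB

# Each intermediate HF's outputs.conf -- forward on to the indexers, again with ACK
[tcpout]
defaultGroup = indexers

[tcpout:indexers]
server = idx1.example.com:9997, idx2.example.com:9997, idx3.example.com:9997
useACK = true
maxQueueSize = 512MB
```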
Another idea I had was to forward to the instance itself when the uplink was down, to a separate out-of-band index so to speak and then merge this later.
But I couldn't get Splunk to forward to itself.