Getting Data In

Splunk Store and Forward HA

ephemeric
Contributor

Hello all,

Forgive my hasty question, it's late and my articulation has dwindled along with my brain capacity...

We need a solution that allows us to not lose any events whatsoever on a collector, AKA heavy forwarder.

We can't do persistent queues across all inputs, as splunktcp-ssl inputs do not allow this. We want all inputs to go straight into Splunk, no syslog relays etc.
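
(For reference, a persistent queue is enabled per input stanza in inputs.conf, roughly like this, with illustrative sizes; the splunktcp/splunktcp-ssl stanzas simply don't accept persistentQueueSize, which is exactly our gap:)

    # inputs.conf on the HF -- sizes are illustrative
    [udp://514]
    queueSize = 1MB               # in-memory queue
    persistentQueueSize = 500MB   # spills to disk when the memory queue fills;
                                  # not supported on splunktcp-ssl inputs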

The index-and-forward function won't work for us, as the locally indexed copy counts toward licence usage. How would one check which events were not acknowledged anyway? I assume something would have to be hacked into place to check which events were actually received?
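
(This is the outputs.conf setting I mean; it only applies on heavy forwarders, and the group name and hostnames are examples:)

    # outputs.conf -- index locally AND forward; the local copy burns licence
    [tcpout]
    defaultGroup = primary_indexers
    indexAndForward = true

    [tcpout:primary_indexers]
    server = idx1:9997,idx2:9997,idx3:9997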

I thought of writing something that monitors the uplink and then stops splunkd, copies over another outputs.conf (with no forward-servers configured), and starts Splunk again so it indexes locally; then it repeats the dance in reverse when the uplink is back.

I have noticed that if all forward-servers are removed from outputs.conf, either at start or via the CLI one at a time, then Splunk automatically starts to index locally on the fly.
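
(For the record, I was toggling the receivers with the standard CLI; hostnames are examples:)

    splunk remove forward-server idx1.example.com:9997   # drop a receiver
    splunk add forward-server idx1.example.com:9997      # add it back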

This is ideal as it happens on the fly with, I presume, no event loss. It seems the closest solution I could find, except that adding the forward-servers back one at a time caused our data to be cloned in triplicate (presumably each CLI add landed in its own target group, and Splunk clones across groups while load-balancing within a group). Ouch!

We don't want to do cloning, hell no, we assume one uplink in each scenario.

We have three receivers (indexers) on the remote side but only one uplink.

I'm lost: how can we get Splunk to index locally ONLY when the uplink is unavailable (and hence the event is not ack'ed), and then merge those buckets/events out of band at a later stage?

It would be perfect if we could put all not-ack'ed events into an index on the localhost after a timeout, and then, when back online, forward those same events, get them ack'ed, and clean out the local index.

This part I know how to do: force a roll of the last hot bucket, then scrub the IDs, scp the warm buckets upstream to an indexer, merge, and restart.
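
(From memory the roll can be driven via the REST API rather than a restart; the index name is an example and the endpoint path is as I recall it:)

    # roll the hot bucket of the 'main' index to warm
    splunk _internal call /data/indexes/main/roll-hot-buckets -method POST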

Thank you.

jrodman
Splunk Employee

If the heavy forwarder has access to a semi-persistent data source (log files), then it does this out of the box.

If the data source is something else (like a UDP input), then I encourage you to render it as logfiles, e.g. via rsyslogd.
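
A minimal sketch of that approach, assuming rsyslog's UDP listener and an illustrative file path:

    # /etc/rsyslog.conf -- write incoming UDP syslog to disk
    $ModLoad imudp
    $UDPServerRun 514
    *.* /var/log/remote/all.log

    # inputs.conf -- then tail that file; monitor inputs checkpoint their read position
    [monitor:///var/log/remote/all.log]
    sourcetype = syslog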

jrodman
Splunk Employee
  • WinEvent live channels should be made to behave similarly to Monitor / Tailing, but without walking the code I have doubts that it correctly advances its event checkpoints in the same manner as tailing. For .evt files, Tailing tracks whether they have been read, so they should behave properly.

jrodman
Splunk Employee
  • splunktcp-ssl isn't really an original input, but a forwarding mechanism. Splunk forwarding, as discussed in your forwarding question, has the ack mechanism to ensure the datastream is handed off cleanly (see the sketch after this list).
  • persistent-queues are not really a way to provide data redundancy, but instead a way to provide queue buffering elasticity.
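
A sketch of enabling the ack mechanism on the sending side (group name and hosts are examples):

    # outputs.conf -- indexer acknowledgement
    [tcpout:remote_indexers]
    server = idx1:9997,idx2:9997,idx3:9997
    useACK = true

Note that with useACK on, the forwarder keeps a wait queue of unacknowledged data sized at 3x the output queue, so memory use goes up.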

ephemeric
Contributor

We could do that, but what about the inputs that do not support persistent queues, like splunktcp-ssl? Because of our high-security client sites, we don't have any access to the forwarders on hosts to change anything (Windows event log buffers etc.). We get to install once and then hope for the best.

ephemeric
Contributor

We have those other inputs too. Don't want to mess with persistent queues.

bmacias84
Champion

Yeah, I don't see a problem with that solution. It will work for everything except streaming data such as perfmon, SNMP, UDP data, etc. Data resilience requires additional hardware (physical or virtual).

ephemeric
Contributor

We need something that can tolerate hours, maybe days of downtime.

bmacias84
Champion

Use indexer acknowledgement between the forwarder and indexer, then increase the client output queue (it will use more RAM). Once the queue is full, the forwarder will stop processing new events, then pick up where it left off once the queue has drained.
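
Something like this on the forwarder (group name and sizes are illustrative):

    # outputs.conf -- ack plus a larger output queue
    [tcpout:remote_indexers]
    server = idx1:9997
    useACK = true
    maxQueueSize = 512MB   # wait queue for unacked data grows to 3x this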

ephemeric
Contributor

Thank you for the suggestion. Cool idea, but we cannot purchase and install more hardware at the client site, it's government...

bmacias84
Champion

Why not use multiple intermediate HFs with load balancing, coupled with indexer acknowledgement? The source forwarder, intermediate, and indexer would all have to acknowledge events before they are removed from the queues. You would then increase your input and output queue sizes to an acceptable size before the HF stops processing new events. You could also increase your source forwarders' output queues. Items will stay in these queues until the indexers are functioning and acknowledging events. I can elaborate if needed.
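
Roughly like this on the source forwarder (hostnames are illustrative):

    # outputs.conf -- load-balance across intermediate HFs
    [tcpout:intermediate_hfs]
    server = hf1:9997,hf2:9997
    autoLBFrequency = 30   # seconds between switching targets
    useACK = true

The intermediate HFs would carry the same useACK setting toward the indexers, so an event only leaves a queue once the next hop has confirmed it.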

ephemeric
Contributor

Another idea I had was to forward to the instance itself when the uplink was down, to a separate out-of-band index so to speak, and then merge this later.

I couldn't get Splunk to forward to itself?
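
(What I was trying to approximate, if I read the docs right, is the selective-indexing toggle rather than a network loop back to itself; this still burns licence for the local copy, and I'd still need to script the switchover when the uplink drops:)

    # outputs.conf on the HF -- index only flagged inputs locally
    [indexAndForward]
    index = true
    selectiveIndexing = true

    # inputs.conf -- flag an input for local indexing
    [udp://514]
    _INDEX_AND_FORWARD_ROUTING = local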
