Our standard universal forwarders currently list all the indexers of our cluster in the [tcpout:indexers] stanza of outputs.conf, e.g. server = host1:9997,host2:9997,...
We don't have indexer acknowledgment enabled. Does that mean that when an indexer goes down, say for a day, we lose data?
Yes and no.
Your in-flight data can be at risk, depending on how an indexer crashes. That means events the forwarder has already sent, but that weren't fully written to disk (or better: replicated to peers in an indexer cluster) when the crash happened.
With indexer acknowledgement, the forwarder resends this in-flight data to another indexer because it never gets its ack.
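For reference, a minimal sketch of what turning on acknowledgement could look like in the forwarder's outputs.conf. The group name and host names are just the placeholders from the question, and it's worth checking the outputs.conf spec for your version before rolling this out:

```ini
# outputs.conf on the universal forwarder (sketch; host names are placeholders)
[tcpout]
defaultGroup = indexers

[tcpout:indexers]
server = host1:9997,host2:9997
# Ask the indexers to acknowledge received data. The forwarder keeps
# unacknowledged data in its wait queue and resends it if no ack arrives.
useACK = true
```

Note that after a failure the same events can occasionally arrive twice, because the forwarder resends anything it didn't get an ack for.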
If the indexer stays down for a day you won't lose a day of data though. The forwarders will not send further events to the crashed indexer, and instead fail over to the other indexers.
Your on-disk data can be at risk if the indexer crashes in a way that damages data on disk, e.g. catastrophic hardware failure, and if you don't have replication in an indexer cluster... and assuming the catastrophic failure isn't large enough to take out other peers or sites too 😉
Makes perfect sense, Martin!
You could instruct the forwarder to clone the data to two indexers, but that's probably not what you want. The two receiving indexers would not de-duplicate against each other later, so each event would be indexed, licensed, and searched twice.
If you want high availability without risk to in-flight data, you want indexer clustering with replication plus indexer acknowledgement. That's what they're there for.
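Just to illustrate what cloning means (not a recommendation): listing more than one target group in defaultGroup makes the forwarder send a full copy of the data to each group. The group names indexer_a and indexer_b below are made up for the example, and the hosts are the placeholders from the question:

```ini
# outputs.conf on the forwarder (sketch of cloning; group names are hypothetical)
[tcpout]
# Two target groups: every event is sent to BOTH groups.
defaultGroup = indexer_a,indexer_b

[tcpout:indexer_a]
server = host1:9997

[tcpout:indexer_b]
server = host2:9997
```

Compare that with a single group that lists several servers, where the forwarder load-balances and each event goes to only one indexer.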
I'm not 100% sure if there are additional things on the application level (probably), but at least at the TCP level the forwarder will know something's wrong.
Great - and the forwarder sends the data to only one active indexer? If so, is it possible to configure the forwarder to send data to two active indexers?
Perfect. I get it about in-flight data.
You said -
-- The forwarders will not send further events to the crashed indexer, and instead fail over to the other indexers.
What's the mechanism here? How does the forwarder know not to send data to this indexer for the time being?
Ok, I see - it won't send data to a down server...
Guys, there is something called load balancing. A UF or HF load-balances by default, so if one indexer is down, data automatically goes to the other indexers.
The load balancing concept is: first it sends a heartbeat to the server. If it doesn't get a response from the server within a specific time, the LB considers this server dead and tries to send the data to another server.
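If you want to see the knobs involved, here is a rough sketch of the relevant outputs.conf settings. The values shown are the defaults as far as I know, so treat them as assumptions and check the outputs.conf spec for your version:

```ini
# outputs.conf on the forwarder (sketch; values are assumed defaults)
[tcpout:indexers]
server = host1:9997,host2:9997
# Automatic load balancing: switch to another indexer from the list
# roughly every 30 seconds.
autoLBFrequency = 30
# Give up on a connection attempt after this many seconds and
# move on to another indexer in the list.
connectionTimeout = 20
```

With several servers in one group you don't need to set anything for failover to happen. These settings only tune how quickly the forwarder rotates and gives up on an unreachable indexer.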
Thanks !!
Cheers,