I need help understanding something.
If I have a folder being monitored by a universal forwarder and I lose connectivity between the universal forwarder and the indexer, does that cause the loss of all the data that arrives in the monitored folder while the connection is down?
Do I need to use Ack to ensure that the universal forwarder does not lose that data?
In which cases do I need to have Ack enabled? Is it only when I have streams/scripts arriving on the forwarder without the results being stored on disk?
Thank you for your answers. The cases stated in your link are for scripted inputs, where data isn't written to disk. In my case the data is already in files, and the files are configured with log rotation, so if connectivity is lost the data will remain in those files. I don't see the use of ack+queues in this case. Doesn't the universal forwarder keep some sort of pointer to track which data in the file has been sent and which hasn't?
Data acknowledgement works the same way regardless of the type of input. The Splunk forwarder does keep track of which data it has sent, but without useACK it doesn't know which data the indexer has actually received, so data can be lost if the UF or the indexer goes down or the connection drops.
But forwarding is TCP based, so how can we have data loss for file-based forwarding if we're sending one TCP window at a time?
When the UF completely loses its connection, you're right: it will just stop sending (and stop reading once its queues fill up) and continue where it left off once the connection is restored.
But when the UF is able to send out the event and the indexer is unable to actually index it (e.g. Splunk crashes during processing), you could still lose data when ACK is not enabled.
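To make that second case concrete, here is a toy simulation in Python. It is not Splunk code: `Indexer`, `forward`, and the crash behaviour are hypothetical stand-ins. It assumes that without acknowledgement the forwarder forgets an event as soon as it has been handed to the network, and that with acknowledgement it resends until the indexer confirms the event was indexed.

```python
class Indexer:
    """Toy indexer (hypothetical): receives events but may crash before indexing them."""

    def __init__(self, crash_on=None):
        # Events that hit a simulated crash on their first delivery.
        self.crash_on = set(crash_on or ())
        self.indexed = []

    def receive(self, event):
        """Return True (an ack) only if the event was actually indexed."""
        if event in self.crash_on:
            # The bytes arrived over TCP, but the crash means they were never indexed.
            self.crash_on.discard(event)  # the indexer restarts; the next attempt succeeds
            return False
        self.indexed.append(event)
        return True


def forward(events, indexer, use_ack):
    """Send events in order; with use_ack, resend each one until it is confirmed."""
    for event in events:
        acked = indexer.receive(event)
        while use_ack and not acked:
            acked = indexer.receive(event)  # resend the unacknowledged event
    return indexer.indexed


events = ["e1", "e2", "e3"]
without_ack = forward(events, Indexer(crash_on={"e2"}), use_ack=False)
with_ack = forward(events, Indexer(crash_on={"e2"}), use_ack=True)
print(without_ack)  # ['e1', 'e3']
print(with_ack)     # ['e1', 'e2', 'e3']
```

In the run without acknowledgement, `e2` was delivered at the network level and still never indexed, which is the gap described above.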
I have the same question: if the data travels over a TCP port and the data acknowledgement setting isn't used, will there be data loss?
Hi Nilbak, I did some digging and here's what I found: TCP guarantees that your data gets across the network, but it doesn't guarantee reception at the application level. So at layer 4 the data arrives, but it might still be missing at the application layer. Ack adds that validation on the receiving side, but it can cause duplicates and slows down performance, so I would avoid it unless it's a strict requirement.
@DavidHourani, why would it cause duplicates? Btw, the Splunk Architect classes emphasize that acknowledgement should be used.
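One likely explanation for the duplicates, sketched as a toy model rather than Splunk's actual implementation: if the indexer indexes an event but its acknowledgement never reaches the forwarder, the forwarder cannot tell that apart from a failure and sends the event again. The function `send_with_ack` and the `lost_acks` parameter below are hypothetical names used only for illustration.

```python
def send_with_ack(events, lost_acks):
    """Toy at-least-once delivery: resend any event whose ack never arrives.

    lost_acks: events whose first acknowledgement is dropped on the way back.
    """
    indexed = []
    pending_loss = set(lost_acks)
    for event in events:
        while True:
            indexed.append(event)  # the indexer indexes every copy it receives
            if event in pending_loss:
                # The ack was lost, so the forwarder gives up waiting and resends.
                pending_loss.discard(event)
                continue
            break  # ack received; the forwarder can drop the event
    return indexed


result = send_with_ack(["e1", "e2", "e3"], lost_acks={"e2"})
print(result)  # ['e1', 'e2', 'e2', 'e3']
```

Nothing is lost in this model, but `e2` is indexed twice, which is the trade-off: acknowledgement gives at-least-once delivery rather than exactly-once.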