
Light Forwarder Data Loss Recovery

New Member

Hi,

In our environment, we have configured the Splunk Light Forwarder to monitor a log file and forward the raw data from this file over TCP to a port on another machine. We have built a custom server that listens on this port on that machine and consumes the raw events sent by the Splunk forwarder. This setup works fine in normal operation.

However, if our server goes down for any reason, we see data loss: the Light Forwarder doesn't forward all of the events written after the server went down. We see that the Light Forwarder does forward some data after it recognizes that the server went down, but NOT all of it.

I was wondering if there is a way to force the Light Forwarder to start sending events from a certain point in time in a file. From the documentation, it seems that the Light Forwarder keeps track of its position in a file. Is there a way to manipulate this so that it starts from a different position?

Thanks, Deepak Deshpande.


Re: Light Forwarder Data Loss Recovery

Explorer

I have been playing around with a few LWFs, and using the following commands I was able to regenerate the data from the forwarders. I would verify what the "clean eventdata" command does first, to make sure you don't lose any data.

./splunk stop

./splunk clean eventdata

./splunk start


Re: Light Forwarder Data Loss Recovery

Splunk Employee

The problem with this might be that it will re-index all of the data, and hence create duplicates in the index.


Re: Light Forwarder Data Loss Recovery

Splunk Employee

Deepak,

After the indexer goes down, the Splunk forwarder should actually stop forwarding completely. If the indexer is down, you should see errors about connectivity to the indexer in splunkd.log on the forwarder. This in turn fills up the tcpout queue on the forwarder. Once the queue is full, the forwarder stops trying to read the data altogether.

Once the indexer is up again, the forwarder starts sending data again, beginning with the data in the queue.

There is no reason for the forwarder to lose data (unless you stop or restart Splunk while the indexer is down, which causes the blocked events in the queue to be dropped).
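
To make the queueing behaviour described above concrete, here is a toy Python model of it. This is only a sketch of the idea (a bounded in-memory queue that blocks the reader when it is full), not how splunkd is actually implemented; the `ToyForwarder` class, its method names, and the queue size of 3 are all made up for illustration.

```python
import queue


class ToyForwarder:
    """Toy model of a forwarder with a bounded in-memory output queue.

    Hypothetical illustration only; this is not Splunk code.
    """

    def __init__(self, max_queue=3):
        self.out_queue = queue.Queue(maxsize=max_queue)
        self.read_pos = 0  # index of the next unread source line

    def read_available(self, lines):
        """Read from the source until the queue is full, then stop reading."""
        while self.read_pos < len(lines):
            try:
                self.out_queue.put_nowait(lines[self.read_pos])
            except queue.Full:
                break  # queue is full: stop reading, keep the position
            self.read_pos += 1

    def drain(self, receiver_up, delivered):
        """Send queued events, oldest first, only while the receiver is up."""
        while receiver_up and not self.out_queue.empty():
            delivered.append(self.out_queue.get_nowait())


def simulate():
    lines = [f"event {i}" for i in range(6)]
    fwd = ToyForwarder(max_queue=3)
    delivered = []

    fwd.read_available(lines)    # receiver down: queue fills, reading stops
    blocked_at = fwd.read_pos

    fwd.drain(True, delivered)   # receiver back: queued events go out first
    fwd.read_available(lines)    # reading resumes from the saved position
    fwd.drain(True, delivered)
    return blocked_at, delivered
```

In this model nothing is lost as long as the process keeps running, because the read position only advances when an event has been queued. The queue lives in memory, though, so restarting the process while events are still queued would drop them, which matches the exception noted above.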

Also, this:

"We see that Light Forwarder does forward some data after it recognizes that server went down but NOT all of it."

doesn't make any sense. If the indexing server is down, how is the forwarder sending the data?


Re: Light Forwarder Data Loss Recovery

New Member

Thanks for the reply. It takes some time for the forwarder to recognize that the remote port it is forwarding data to is down. After recognizing that the port is down, it starts queuing messages and resends them once the port comes back up. But because of the latency in recognizing that the port is down, all messages forwarded during that window are lost. I don't think the forwarder has a built-in ACK feature to know for sure that a message was delivered. Hence, I was trying to find out if there is a way to start re-reading a file from a point in time. Can you please let me know if there is a way?
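
To illustrate what I mean by re-reading from a known position, here is a rough Python sketch of a sender that only advances its saved file offset after the receiver acknowledges a line. This is not a Splunk feature or API; `forward`, `flaky_send`, and the way the ACK is modelled (a `send` callback returning True or False) are all hypothetical names and assumptions for the sake of the example.

```python
import os
import tempfile


def forward(path, offset, send):
    """Send each complete line at or after `offset`.

    The offset advances only past lines that send() acknowledged (returned
    True), so anything unacknowledged is re-read on the next call.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            line = f.readline()
            if not line.endswith(b"\n"):
                break  # end of file, or a partial line still being written
            if not send(line.rstrip(b"\n").decode("utf-8")):
                break  # not acknowledged: keep the offset so it is re-sent
            offset += len(line)
    return offset


def demo():
    with tempfile.NamedTemporaryFile("wb", suffix=".log", delete=False) as tmp:
        tmp.write(b"a\nb\nc\nd\n")
        path = tmp.name

    received = []
    budget = [2]  # the receiver "goes down" after acknowledging two events

    def flaky_send(event):
        if budget[0] == 0:
            return False
        budget[0] -= 1
        received.append(event)
        return True

    try:
        first_offset = forward(path, 0, flaky_send)  # stops after "b"
        budget[0] = 10                               # receiver is back up
        final_offset = forward(path, first_offset, flaky_send)
    finally:
        os.remove(path)
    return first_offset, final_offset, received
```

With a scheme like this, the saved offset could also be set back by hand to replay part of the file, at the cost of the receiver possibly seeing duplicates.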
