Getting Data In

How does the forwarder handle large amounts of data?

Ultra Champion

An internal client is asking:

How often does the Splunk forwarder read data from the log files? Does it ever sleep? If a log file filled up in a matter of seconds, could that data be lost?


SplunkTrust

The last answer (2.c) is based on my experience.

1) Reading is near real-time. The Splunk forwarder keeps a list of the files and directories to monitor and checks them for new events all the time. It doesn't sleep.
2)a. If a log file fills up fast but keeps the same name and location, no data is lost. Splunk tracks how far it has read into each file and continues from that point. During a spike in data you would see some latency, though, depending on maxKBps on the forwarder, network bandwidth, and load on the indexer/intermediate forwarders.
b. If a log file fills up fast and gets renamed (rolled over) but stays in place (a one-level rename), Splunk keeps reading the content of the renamed file. The same limitation as in 2.a applies.
c. If a log file fills up fast and gets rolled over more than once before Splunk has been able to read the whole file, there will be data loss. For example: the original log file myapp.log rolls over to myapp.log.1, and a blank myapp.log is created for new events. When myapp.log fills up again, myapp.log.1 is renamed to myapp.log.2, the current myapp.log is renamed to myapp.log.1, a new myapp.log is created, and so on.
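To make the difference between 2.a and 2.c concrete, here is a rough toy model in Python. It is not how Splunk is implemented. It only assumes a reader that is throttled to a fixed rate (standing in for maxKBps) and that can still reach the live file plus a fixed number of rotated generations. The function name, its parameters, and the numbers (2000 KB/s burst, 256 KB/s read rate, 1000 KB files) are all made up for illustration.

```python
def simulate_tailing(write_kbps, read_kbps, file_kb, reachable_rotations, seconds):
    """Toy model of a throttled reader tailing a rotating log (sizes in KB)."""
    written = 0   # total KB the application has logged so far
    offset = 0    # reader's position in the overall stream (its checkpoint)
    consumed = 0  # KB the reader actually picked up
    lost = 0      # KB that rotated out of reach before being read

    for _ in range(seconds):
        written += write_kbps
        rotations = written // file_kb
        # Oldest KB still reachable: the live file plus N rotated generations.
        oldest_reachable = max(0, (rotations - reachable_rotations) * file_kb)
        if offset < oldest_reachable:
            # Unread data has been rolled past the reachable files.
            lost += oldest_reachable - offset
            offset = oldest_reachable
        # Read as much as the throttle allows, never past what was written.
        step = min(read_kbps, written - offset)
        offset += step
        consumed += step

    backlog = written - offset
    return {
        "written_kb": written,
        "consumed_kb": consumed,
        "lost_kb": lost,
        "backlog_kb": backlog,
        # Time to catch up once the burst stops, at the throttled rate.
        "drain_seconds": backlog / read_kbps,
    }


if __name__ == "__main__":
    # Same 10-second burst, once into a file that never rotates (case 2.a)
    # and once into small files with only one reachable rotation (case 2.c).
    for label, file_kb in [("same file (2.a)", 10**9), ("rotating (2.c)", 1000)]:
        print(label, simulate_tailing(2000, 256, file_kb, 1, 10))
```

In the first run nothing is lost; the reader just falls behind and needs time to drain the backlog, which is the latency described in 2.a. In the second run the files roll faster than the throttled reader can follow, so part of the data is gone before it is read, as in 2.c.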

Ultra Champion

Gorgeous - please convert to an answer.
