Getting Data In

How does the forwarder handle large amounts of data?

ddrillic
Ultra Champion

An internal client is asking -

-- How often is the splunk forwarder reading data from the log files? does it ever sleep? if we had a log file fill up in a matter of seconds, could that data be lost?

Tags (4)
0 Karma
1 Solution

somesoni2
Revered Legend

Last answer is based on my experience.

1) Reading data is near real-time. Splunk forwarder has list of files/directories to be monitored and is looking for new events all the time. It doesn't sleep.
2)a. If a log file is filling up fast, but stays with same name location, no data would be lost. Splunk keeps track of till which point it has read the files and continues from there. If there is a spike in data, you would see some latency though, based on maxKBps on forwarder, network bandwidth and load on indexer/intermediate forwarders.
b. If a log file is filling up fast and getting renamed (rolled over) but staying there (one level rename) Splunk would still be reading content from that renamed file. Same limitation would apply as 2.a point.
c. If a log file is filling up fast and getting rolledover more than once (say original log file was myapp.log, it rolled over to myapp.log.1, creating a blank myapp.log file for next events. If myapp.log is filled again, the myapp.log.1 is renamed as myapp.log.2, current myapp.log is renamed as myapp.log.1 and a new myapp.log is created, and so on), and Splunk is not able to read the whole file, there will be data loss.

View solution in original post

0 Karma

somesoni2
Revered Legend

Last answer is based on my experience.

1) Reading data is near real-time. Splunk forwarder has list of files/directories to be monitored and is looking for new events all the time. It doesn't sleep.
2)a. If a log file is filling up fast, but stays with same name location, no data would be lost. Splunk keeps track of till which point it has read the files and continues from there. If there is a spike in data, you would see some latency though, based on maxKBps on forwarder, network bandwidth and load on indexer/intermediate forwarders.
b. If a log file is filling up fast and getting renamed (rolled over) but staying there (one level rename) Splunk would still be reading content from that renamed file. Same limitation would apply as 2.a point.
c. If a log file is filling up fast and getting rolledover more than once (say original log file was myapp.log, it rolled over to myapp.log.1, creating a blank myapp.log file for next events. If myapp.log is filled again, the myapp.log.1 is renamed as myapp.log.2, current myapp.log is renamed as myapp.log.1 and a new myapp.log is created, and so on), and Splunk is not able to read the whole file, there will be data loss.

0 Karma

ddrillic
Ultra Champion

Gorgeous - please convert to an answer.

0 Karma
Get Updates on the Splunk Community!

Splunk Forwarders and Forced Time Based Load Balancing

Splunk customers use universal forwarders to collect and send data to Splunk. A universal forwarder can send ...

NEW! Log Views in Splunk Observability Dashboards Gives Context From a Single Page

Today, Splunk Observability releases log views, a new feature for users to add their logs data from Splunk Log ...

Last Chance to Submit Your Paper For BSides Splunk - Deadline is August 12th!

Hello everyone! Don't wait to submit - The deadline is August 12th! We have truly missed the community so ...