How does the forwarder handle large amounts of data?

ddrillic
Ultra Champion

An internal client is asking:

How often does the Splunk forwarder read data from the log files? Does it ever sleep? If a log file filled up in a matter of seconds, could that data be lost?

somesoni2
Revered Legend

The last answer (2c) is based on my own experience.

1) Reading is near real-time. The Splunk forwarder has a list of files and directories to monitor and looks for new events all the time. It doesn't sleep.
2)a. If a log file is filling up fast but keeps the same name and location, no data is lost. Splunk keeps track of the point up to which it has read each file and continues from there. During a spike in data you would see some latency, though, depending on maxKBps on the forwarder, network bandwidth, and load on the indexer or intermediate forwarders.
b. If a log file is filling up fast and gets renamed (rolled over) but stays in the same directory (a one-level rename), Splunk keeps reading content from the renamed file. The same limitation applies as in 2.a.
c. If a log file is filling up fast and gets rolled over more than once before Splunk has read the whole file, there will be data loss. For example, myapp.log rolls over to myapp.log.1 and a blank myapp.log is created for the next events; when myapp.log fills again, myapp.log.1 is renamed to myapp.log.2, the current myapp.log is renamed to myapp.log.1, a new myapp.log is created, and so on.
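
To put rough numbers on points 2.a to 2.c, here is a minimal back-of-the-envelope sketch in Python. It is not Splunk's actual logic: the function name, its parameters, and the sample values are hypothetical, and it assumes the forwarder reads at a fixed rate (for example, one capped by maxKBps) and can only follow a file through a fixed number of renames, as described in 2.b and 2.c.

```python
from typing import Optional


def seconds_until_at_risk(
    write_kbps: float,
    read_kbps: float,
    file_kb: float,
    followed_renames: int = 1,
) -> Optional[float]:
    """Seconds of sustained burst before unread data may be rotated out of reach.

    Simplified model (an assumption, not Splunk's implementation): the reader
    falls behind by (write_kbps - read_kbps) KB every second, and the only
    backlog guaranteed to still be reachable is `followed_renames` full
    rotated files. The worst case is right after a rollover, when the live
    file is still empty. Returns None when the reader keeps up.
    """
    if write_kbps <= read_kbps:
        return None
    safe_backlog_kb = followed_renames * file_kb
    return safe_backlog_kb / (write_kbps - read_kbps)


# Illustrative numbers only: 10 MB files, a 1256 KB/s burst, reads capped at 256 KB/s.
print(seconds_until_at_risk(1256, 256, 10240))  # 10.24
# Reader at least as fast as the writer: latency at most, no loss in this model.
print(seconds_until_at_risk(200, 256, 10240))   # None
```

In this model a short spike only adds latency (case 2.a), while a burst that lasts longer than the returned number of seconds is where case 2.c starts to apply.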

ddrillic
Ultra Champion

Gorgeous - please convert to an answer.
