Solved: How does the forwarder handle large amounts of dat...

ddrillic · ‎07-25-2017

An internal client is asking -

-- How often does the Splunk forwarder read data from the log files? Does it ever sleep? If a log file filled up in a matter of seconds, could that data be lost?

somesoni2 · ‎07-25-2017

Last answer is based on my experience.

1) Reading is near real-time. The Splunk forwarder keeps a list of files/directories to monitor and checks them for new events all the time. It doesn't sleep.
2)a. If a log file fills up fast but keeps the same name and location, no data is lost. Splunk tracks how far into each file it has read and continues from that point. During a spike in data you will see some latency, though, depending on maxKBps on the forwarder, network bandwidth, and load on the indexers/intermediate forwarders.
b. If a log file fills up fast and is renamed (rolled over) once but stays in the same location, Splunk keeps reading the content of the renamed file. The same latency limitation as in point 2.a applies.
c. If a log file fills up fast and is rolled over more than once before Splunk has read the whole file, there will be data loss. For example: the original log file is myapp.log; it rolls over to myapp.log.1 and a blank myapp.log is created for the next events. When myapp.log fills again, myapp.log.1 is renamed to myapp.log.2, the current myapp.log is renamed to myapp.log.1, a new myapp.log is created, and so on.
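
To put rough numbers on points 2.a and 2.c, here is a back-of-the-envelope sketch in Python. It is not how Splunk works internally; it only models the forwarder as reading at a fixed cap (the maxKBps value) while the application writes faster than that. The function names, the `readable_files=2` default (current file plus one rolled-over copy, per 2.b) and all the figures are assumptions made up for illustration.

```python
from typing import Optional


def backlog_kb(write_kbps: float, max_kbps: float, seconds: float) -> float:
    """Unread data (KB) after `seconds` of a sustained burst.

    Assumes a constant write rate and a forwarder that reads at
    exactly max_kbps (a simplification of the maxKBps cap).
    """
    return max(0.0, (write_kbps - max_kbps) * seconds)


def seconds_until_at_risk(
    write_kbps: float,
    max_kbps: float,
    file_size_kb: float,
    readable_files: int = 2,  # assumption: current file + one rolled copy
) -> Optional[float]:
    """Seconds of sustained burst before the backlog exceeds the data
    that is still reachable (readable_files * file_size_kb).

    Returns None when the forwarder keeps up, i.e. the backlog never grows.
    """
    growth = write_kbps - max_kbps
    if growth <= 0:
        return None
    return readable_files * file_size_kb / growth


if __name__ == "__main__":
    # Hypothetical burst: app writes 1024 KB/s, cap is 256 KB/s,
    # the log rolls at 10 MB (10240 KB).
    print(backlog_kb(1024, 256, 10))
    print(seconds_until_at_risk(1024, 256, 10240))
```

In this toy model a short spike only creates latency (case 2.a), while a burst that lasts longer than the returned number of seconds would push unread data into a second rollover (case 2.c). Real behaviour also depends on network bandwidth and indexer load, which the sketch ignores.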

ddrillic · ‎07-25-2017

Gorgeous - please convert to an answer.
