Does Splunk need to go through all the files after a restart before indexing new events?

I would like to ask whether anyone has experienced this or has a solution.

I have a forwarder that reads a very large number of small log files (on the order of millions) in one folder. Every time I change the configuration, for example to add a new index or edit a .conf file, I need to restart. After the restart, Splunk takes forever to go through all of these files before it starts indexing my new files.

Is there a way in the configuration to overcome this?

Thanks in advance.

Re: Does Splunk need to go through all the files after a restart before indexing new events?

I have had similar experiences. Especially when the log files are on network shares rather than on local disk, this can take quite some time (even with much smaller numbers of files).

For your case: are all those log files still active, or does the folder also contain old, rotated files that are no longer written to? If there are a lot of old inactive files, there are a few options you can look at:
- Put a cleanup script in place to get rid of those old files after some time.
- If the old files can be recognized from their name (e.g. they get a suffix when rotated), write your input stanza such that they are ignored.
- Use the ignoreOlderThan setting in inputs.conf to ignore old files.
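
As a rough sketch of the last two options, an inputs.conf stanza could look like the one below. The monitored path, the index name, the 7-day window and the rotation suffix pattern are all assumptions for illustration; adjust them to how your files are actually named and rotated.

```ini
# inputs.conf on the forwarder (path and index name are hypothetical)
[monitor:///var/log/myapp]
index = myapp_index
# Skip files whose modification time is more than 7 days old
ignoreOlderThan = 7d
# Skip rotated files, assuming they get a numeric suffix, optionally gzipped
# (e.g. app.log.1 or app.log.2.gz)
blacklist = \.log\.\d+(\.gz)?$
```

As far as I know, a file that has been skipped because of ignoreOlderThan is not picked up again if it is written to later, so only use that setting when the old files really are finished.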

If all those files are actively being written to, you could perhaps look at enabling multiple pipelines on your forwarder (if the hardware specs allow that), to enable Splunk to process multiple files in parallel.
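
For reference, multiple pipelines are configured in server.conf on the forwarder. The value 2 below is only an example; each extra pipeline uses additional CPU and memory, so check the hardware first.

```ini
# server.conf on the forwarder
[general]
# Number of ingestion pipeline sets; the default is 1
parallelIngestionPipelines = 2
```

A restart is needed for this change to take effect.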