Deployment Architecture

Does Splunk need to go through all the file after restart before indexing new events?

quahfamili
Path Finder

Hi,

I would like to ask if anyone has experienced this or has a solution for it.

I have a forwarder that is reading many small log files (on the order of millions) in a folder. Every time I make a configuration change for a new index or .conf file, I need to restart, and Splunk takes forever to go through all of these files before it starts indexing my new files.

Is there a way in the configuration to overcome this?

Thanks in advance.
Alan


FrankVl
Ultra Champion

I have had similar experiences. Especially when the log files are on network shares rather than on local disk, that can take quite some time (even with much smaller numbers of files).

For your case: are all those log files still active, or does the folder also contain rotated old files that are inactive? If there are a lot of old inactive files, there are a few options you can look at (see the example stanza after this list):
- put some cleanup script in place to get rid of those old files after some time
- If the old files can be recognized from their name (e.g. they get a suffix when rotated), write your input stanza such that they are ignored.
- Use the ignoreOlderThan setting in inputs.conf to ignore old files
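A minimal inputs.conf sketch covering the last two options could look like the following; the monitor path, index, sourcetype and the rotated-file suffix pattern are assumptions you would adjust to your own environment:

    [monitor:///var/log/myapp]
    # "my_index" and "my_sourcetype" are placeholder names
    index = my_index
    sourcetype = my_sourcetype
    # Skip rotated files, assuming rotation adds a .gz or numeric suffix to the name
    blacklist = \.(gz|\d+)$
    # Ignore files whose modification time is more than 7 days old
    ignoreOlderThan = 7d

The blacklist regex is matched against the full file path, and ignoreOlderThan compares against the file's modification time.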

If all those files are actively being written to, you could perhaps look at enabling multiple ingestion pipelines on your forwarder (if the hardware specs allow it), so that Splunk can process multiple files in parallel; see the example below.
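For the multiple-pipelines option, the relevant setting is parallelIngestionPipelines in server.conf on the forwarder. A minimal sketch, assuming the forwarder has spare CPU and memory for a second pipeline:

    [general]
    # Each additional pipeline set needs roughly one more CPU core plus its own queues and memory
    parallelIngestionPipelines = 2

A restart of the forwarder is needed for this change to take effect.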
