
How to force Splunk to index new files quickly?


How do I force Splunk to immediately index new files in a directory that is being monitored? Sometimes it takes a very long time to detect and index new files.




Splunk Employee

How many files are in the directory (and below it), how many are actively being written to, and what version of Splunk is the forwarder running (assuming there is a forwarder; otherwise, the component that reads the files)?


Splunk Employee

How many files are you monitoring in that directory? Is it a subdirectory of a parent directory that is also being monitored?

If there are lots (hundreds or thousands) of files in the directory, Splunk has to cycle through all of them to detect changes. If it makes sense for your data, applying a whitelist so that only the files you care about are monitored may speed up change detection. Alternatively, configure a single input per file/source in the directory instead of monitoring the entire directory.
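The two options above could look roughly like the following inputs.conf sketch. The directory /var/log/myapp, the file name app.log, and the regex are hypothetical placeholders, not values from this thread; the whitelist setting is a regular expression matched against the full file path. Treat these as alternatives and pick one.

```ini
# inputs.conf (sketch; paths and regex below are hypothetical examples)

# Option 1: monitor the directory, but only pick up files ending in .log
[monitor:///var/log/myapp]
whitelist = \.log$

# Option 2: monitor one specific file instead of the whole directory
[monitor:///var/log/myapp/app.log]
```

With fewer files matching the input, Splunk has fewer candidates to cycle through when checking for changes.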

A more drastic approach is to increase the number of file descriptors Splunk uses for monitoring inputs. The default is 32 FDs, which means Splunk uses a sliding window of 32 files to check for changes at any given time. Try doubling or tripling this value to see if it helps, but try the other options first.

This parameter is controlled in limits.conf:

max_fd = <integer>
* Maximum number of file descriptors that Splunk can use in the Select Processor.
* The maximum value honored is half the current number of allowed file descriptors per process. (ulimit -n /setrlimit NOFILES)
* If a value chosen is higher than the maximum allowed value, the maximum value is used instead.
* Defaults to 32.
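As a rough sketch, doubling the default would look like the fragment below. The value 64 is only an illustration, and the [inputproc] stanza name is what the limits.conf spec uses as far as I know, so check it against the limits.conf.spec shipped with your version. Put the setting in a local limits.conf rather than editing the default file.

```ini
# limits.conf (sketch; 64 is an example value, double the default of 32)
[inputproc]
max_fd = 64
```

Keep in mind the cap described above: a value higher than half of the per-process file descriptor limit (ulimit -n) is not honored. A change like this typically needs a Splunk restart to take effect.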

Splunk Employee

The 'cycling' behavior changed a lot in version 4.1 and later, compared with 4.0.x and earlier. From 4.1 onward, files that have not been updated recently are checked less and less frequently, so you can have many static or old rolled copies in the directory with minimal performance impact. Of course, this could work against you in perverse circumstances, but in practice it is fine.
