
How to prevent directory bombs on forwarders?

twinspop
Influencer

I spent all of yesterday trying to figure out why a client's logs weren't being indexed. For most of that time I had no access to the server in question, so I was troubleshooting from internal logs, configs, and the sporadic logs that showed up briefly after a restart.

Finally, just as I was about to throw in the towel, I started poking around the directories above the target files. The monitor line had an asterisk at that point in the path, so even though most of the other directories couldn't match further down the path, Splunk still had to check them. Several of them held 100k+ files, so Splunk was stuck trying to read those directories. Even a plain ls | wc -l took over 10 minutes on a few of them.

I can find big directories with something like this and send the output to Splunk for alerting:

find /path -size +100k -type d

The size threshold can be adjusted as needed. Is there a better way to avoid these landmines?
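One caveat with the find command above: -size on a directory tests the size of the directory entry itself, which is only a proxy for how many files it holds, and on some filesystems (ext4, for example) a directory does not shrink again after its files are deleted. A rough sketch of a check that counts direct entries instead is below. The function name big_dirs and its default limit are made up for illustration, and -mindepth/-maxdepth assume GNU or BSD find. Counting a 100k+ directory is still slow, so this is something to run from cron rather than interactively.

```shell
#!/bin/sh
# Hypothetical helper (not part of Splunk): print "<count> <dir>" for every
# directory under ROOT that holds more than LIMIT direct entries.
# Assumes no newlines in file or directory names.
big_dirs() {
    root=$1
    limit=${2:-100000}   # illustrative default, tune to taste
    find "$root" -type d 2>/dev/null | while IFS= read -r dir; do
        # Count direct children only; tr strips the padding some wc builds add.
        count=$(find "$dir" -mindepth 1 -maxdepth 1 2>/dev/null | wc -l | tr -d ' ')
        if [ "$count" -gt "$limit" ]; then
            printf '%s %s\n' "$count" "$dir"
        fi
    done
}
```

Calling big_dirs /path 100000 prints one line per oversized directory, which could be written to a file that a forwarder already monitors.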

Thanks,
Jon


yannK
Splunk Employee

It's hard to avoid scanning every file under the potential path when the monitor path contains a wildcard.

To avoid indexing unnecessary files in subfolders, you could try disabling recursive monitoring of subfolders with this setting in the monitor stanza:
recursive = true|false
See http://docs.splunk.com/Documentation/Splunk/7.0.0/Data/Monitorfilesanddirectorieswithinputs.conf
