Excluding folders with monitor input

Path Finder

I have a folder containing logs, laid out as below. I want to exclude all directories named DONTINDEX_* and index the contents of every other subfolder of 'logs'.

logs\  
    a\  
    b\  
    DONTINDEX_a\  
    c\
    DONTINDEX_b\  
    d\

Using _blacklist inside the [monitor:///logs] stanza works, but Splunk still scans the DONTINDEX_* folders. The problem is that those folders contain 100k+ small files, which slows indexing and places a heavy load on the source server.

How do I exclude the entire folder without specifying a separate monitor stanza for each folder I want to scan (a,b,c,d)?

I'm running Splunk 4.1.2 on Windows 2008 R2.

Re: Excluding folders with monitor input

Champion

The documentation provides an example of how to exclude entire directories using a blacklist. For example:

[monitor:///mnt/logs]
    blacklist = (DONTINDEX_a|DONTINDEX_b)
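To see which paths a pattern like this would exclude, here is a minimal Python sketch. It assumes the blacklist is applied as an unanchored regular expression against the full file path; the sample paths and the helper name `is_blacklisted` are made up for illustration and are not part of Splunk. It only shows what gets filtered, not whether Splunk still descends into the matching directories.

```python
import re

# Same alternation as the blacklist in the stanza above.
BLACKLIST = re.compile(r"(DONTINDEX_a|DONTINDEX_b)")


def is_blacklisted(path):
    """Return True if the pattern matches anywhere in the full path."""
    return BLACKLIST.search(path) is not None


# Hypothetical file paths, invented for this example.
sample_paths = [
    "/mnt/logs/a/app.log",
    "/mnt/logs/DONTINDEX_a/0001.log",
    "/mnt/logs/DONTINDEX_b/sub/0002.log",
    "/mnt/logs/d/app.log",
]

# Paths that survive the blacklist and would be candidates for indexing.
kept = [p for p in sample_paths if not is_blacklisted(p)]
```

Under the same assumption, a shorter pattern such as `DONTINDEX_` should cover every folder with that prefix without listing each one.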

Re: Excluding folders with monitor input

Path Finder

Splunk still scans those directories, evaluating each file within against the whitelist and blacklist. I don't want Splunk to open those folders in the first place.

Re: Excluding folders with monitor input

Super Champion

Please mention which version of Splunk you are using.

Re: Excluding folders with monitor input

Splunk Employee

Yes, please do. It makes a difference in behavior.

Re: Excluding folders with monitor input

Path Finder

Don't know how I forgot to include that. It's Splunk 4.1.2 on Windows 2008 R2.

Re: Excluding folders with monitor input

Splunk Employee

Is it reasonable for you to have a parallel directory containing symlinks to the subdirs you wish to monitor?

For example:

logsyms/
    a -> /logs/a
    b -> /logs/b
    c -> /logs/c
    d -> /logs/d

This would do what you want, as you would be monitoring the logsyms directory, and DONTINDEX* wouldn't come into play. We could even make the 'source' field look like it's coming from /logs, if you need it.

Whether or not this works for you depends more on the specifics of what those subdirectory names are...
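
If the subfolder names change over time, the parallel directory could be rebuilt by a small script rather than by hand. The following is a rough Python sketch of that idea, not something Splunk provides: the function name `build_symlink_dir`, its parameters, and the `^DONTINDEX_` pattern are assumptions for illustration. On Windows, creating symlinks generally needs elevated rights, so treat this as a starting point only.

```python
import os
import re


def build_symlink_dir(src_root, link_root, exclude=r"^DONTINDEX_"):
    """Create link_root/<name> -> src_root/<name> for each wanted subfolder.

    Subfolders whose names match `exclude` are skipped, so a monitor
    pointed at link_root never sees them. Returns the linked names.
    """
    skip = re.compile(exclude)
    os.makedirs(link_root, exist_ok=True)
    linked = []
    for entry in sorted(os.scandir(src_root), key=lambda e: e.name):
        # Ignore plain files and any excluded folder names.
        if not entry.is_dir() or skip.search(entry.name):
            continue
        link = os.path.join(link_root, entry.name)
        # Leave existing links alone so the script can be re-run safely.
        if not os.path.lexists(link):
            os.symlink(os.path.abspath(entry.path), link,
                       target_is_directory=True)
        linked.append(entry.name)
    return linked
```

Calling `build_symlink_dir("/logs", "/logsyms")` on the layout from the question would link a, b, c and d; the monitor stanza would then point at the logsyms directory instead of logs.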