Getting Data In

Tailing processor and rsync (dot files) - blacklist?

Contributor

Easy one, this, but I can't seem to get it right.

I'm monitoring a series of directories which are rsync'd from other servers. Splunk, being ever so efficient, is managing to index the temporary dot files that rsync creates during a transfer, as well as the files after they arrive. This has resulted in rather a lot of unnecessary data.

The answer, to me, should be either whitelists or blacklists.

For one of the directories, I can whitelist, as the files are all "blah.log" and thus "blah.log$" should work fine.

However, in other directories the files are named all sorts of things, and there's no easy regex to whitelist. So a blacklist should do the trick. But I can't seem to get a regex working for "any file starting with a ."

Hints?

1 Solution

Contributor

There was something definitely amiss with the ability to parse recursive directories and use whitelist/blacklists, so I've gone ahead and created a monitor stanza in my inputs.conf for each of the 8 files. That was the only thing that got Splunk to actually show the content of those files in a query.
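For anyone wanting to do the same without typing each stanza by hand, here is a rough sketch that prints one monitor stanza per file. It reuses the settings from the monitor stanza quoted elsewhere in this thread and drops the whitelist, since each stanza names a single file. The file name submarine.out is an assumption inferred from the whitelist regex, only the three directories listed in the thread are included, and render_stanza and render_inputs are made-up helper names.

```python
# Directories quoted in this thread; the full list of ~8 is not shown there.
LOG_DIRS = [
    "/Volumes/A/b/c/cluster3/data/instance/box-4/logs",
    "/Volumes/A/b/c/cluster2/data/instance/box-3/logs",
    "/Volumes/A/b/c/cluster2/data/instance/box-1/logs",
]

# Assumption: the file name is inferred from the whitelist "submarine\.out$".
FILE_NAME = "submarine.out"

# Settings copied from the directory-level stanza, minus the whitelist.
SETTINGS = {
    "crcSalt": "<SOURCE>",
    "disabled": "false",
    "followTail": "0",
    "host": "strawberry",
    "index": "submarine",
    "sourcetype": "log4j",
}


def render_stanza(directory: str) -> str:
    """Build one [monitor://...] stanza for the log file in `directory`."""
    lines = [f"[monitor://{directory}/{FILE_NAME}]"]
    lines += [f"{key} = {value}" for key, value in SETTINGS.items()]
    return "\n".join(lines)


def render_inputs(directories) -> str:
    """Join the stanzas with blank lines, as they would sit in inputs.conf."""
    return "\n\n".join(render_stanza(d) for d in directories) + "\n"


print(render_inputs(LOG_DIRS))
```

The output is meant to be reviewed and pasted into inputs.conf by hand, not written there automatically.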



Splunk Employee
blacklist = /\.[^/]+$

should do it
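If you want to sanity-check that pattern outside Splunk, a small sketch like the one below will do. It uses Python's re module rather than Splunk's own regex engine, which should behave the same for a pattern this simple, and it assumes the regex is applied to the full file path. The sample paths and the is_blacklisted helper are invented for illustration.

```python
import re

# Pattern from the answer above: a slash, a literal dot, then the rest of
# the file name up to the end of the path.
BLACKLIST = re.compile(r"/\.[^/]+$")


def is_blacklisted(path: str) -> bool:
    """Return True if the final path component begins with a dot."""
    return BLACKLIST.search(path) is not None


# Hypothetical paths, made up for this check.
samples = [
    "/Volumes/A/b/c/logs/submarine.out",           # normal file
    "/Volumes/A/b/c/logs/.submarine.out.Xy12Ab",   # rsync-style temp file
    "/Volumes/A/b/.hidden/submarine.out",          # normal file in a dot directory
]

for path in samples:
    print(path, is_blacklisted(path))
```

Only the second path matches. A file inside a dot directory is not excluded, because the pattern looks only at the last path component.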

Motivator

What does your current regex look like? Make sure you're not forgetting to put a backslash in front of the dot, or the regex will treat it as a wildcard that matches any character.

Have you tried just:

blacklist=^\.

(For older versions of Splunk, use _blacklist instead of blacklist)


Contributor

I've put in gkanapathy's blacklist for now, but I think something is wrong with my whitelist. Is there any potential interaction between whitelists and monitoring directories which have sub-directories (and it's in the sub-directories where my files are)?

I now have:


[monitor:///Volumes/A/b/c]
crcSalt = <SOURCE>
disabled = false
followTail = 0
host = strawberry
index = submarine
whitelist = submarine\.out$
sourcetype = log4j

However, my files are actually located in:

/Volumes/A/b/c/cluster3/data/instance/box-4/logs
/Volumes/A/b/c/cluster2/data/instance/box-3/logs
/Volumes/A/b/c/cluster2/data/instance/box-1/logs

And so on: a list of about 8 locations, but since they're all under "c", I just pointed Splunk at that.

According to the inputstatus Tailing Processor URL, it's found "c" and some files in "c" which did not match the whitelist, but there's no indication that it has found any data further down the path, and that data is definitely not in the index (yesterday's data is there, from before I made this whitelist change).
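One way to rule out the regex itself is to try the whitelist pattern against the full nested paths. The sketch below uses Python's re module as a stand-in for Splunk's matcher, assumes the whitelist is applied to the full path, and assumes the file in each logs directory is called submarine.out (inferred from the whitelist, not confirmed anywhere in this thread). The is_whitelisted helper is a made-up name.

```python
import re

# Whitelist from the monitor stanza above.
WHITELIST = re.compile(r"submarine\.out$")

# The three log directories listed above.
LOG_DIRS = [
    "/Volumes/A/b/c/cluster3/data/instance/box-4/logs",
    "/Volumes/A/b/c/cluster2/data/instance/box-3/logs",
    "/Volumes/A/b/c/cluster2/data/instance/box-1/logs",
]


def is_whitelisted(path: str) -> bool:
    """Return True if the full path ends in submarine.out."""
    return WHITELIST.search(path) is not None


for directory in LOG_DIRS:
    # Assumed file name, inferred from the whitelist regex.
    path = f"{directory}/submarine.out"
    print(path, is_whitelisted(path))

# A rotated file such as submarine.out.1 (hypothetical) would not match.
print(is_whitelisted(LOG_DIRS[0] + "/submarine.out.1"))
```

All three nested paths match, so under these assumptions the pattern alone would not explain the missing data.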
