
Tailing processor and rsync (dot files) - blacklist?

howyagoin
Contributor

Easy one, this, but I can't seem to get it right.

I'm monitoring a series of directories which are rsync'd from other servers. Splunk, being ever so efficient, is managing to index the temporary dot files that rsync creates during a transfer, as well as the files after they arrive. This has resulted in rather a lot of unnecessary data.

The answer, to me, should be either whitelists or blacklists.

For one of the directories, I can whitelist, as the files are all "blah.log" and thus "blah.log$" should work fine.

However, in other directories the files are named all sorts of things, and there's no easy regex to whitelist. So a blacklist should do the trick. But I can't seem to get a regex working for "any file starting with a ."

Hints?

1 Solution

howyagoin
Contributor

There was something definitely amiss with the ability to parse recursive directories and use whitelist/blacklists, so I've gone ahead and created a monitor stanza in my inputs.conf for each of the 8 files. That was the only thing that got Splunk to actually show the content of those files in a query.


gkanapathy
Splunk Employee
blacklist = /\.[^/]+$

should do it
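For anyone who wants to sanity-check that regex outside Splunk, here is a minimal sketch in Python. It assumes the pattern is applied to the full path of each file, and it uses Python's `re` module as a stand-in for Splunk's regex engine, which behaves the same for a pattern this simple. The file names below are made up for illustration (rsync temporary files typically look like `.name.XXXXXX`).

```python
import re

# Regex from the post above: a slash, a literal dot, then non-slash
# characters up to the end of the path, i.e. a dot file as the last component.
BLACKLIST = re.compile(r"/\.[^/]+$")

# Hypothetical paths, for illustration only.
paths = [
    "/Volumes/A/b/c/cluster2/data/instance/box-1/logs/submarine.out",
    "/Volumes/A/b/c/cluster2/data/instance/box-1/logs/.submarine.out.a1B2c3",
    "/Volumes/A/b/c/.hidden/app.log",
]


def is_blacklisted(path):
    """Return True if the path would be excluded by the blacklist regex."""
    return BLACKLIST.search(path) is not None


for p in paths:
    print("skip" if is_blacklisted(p) else "keep", p)
```

Note that the third path is kept: the pattern only excludes a dot file in the final path component, not ordinary files that sit inside a dot directory.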

southeringtonp
Motivator

What does your current regex look like? Make sure you're not forgetting to put a backslash in front of the dot, or the regex will treat it as a wildcard that matches any character.

Have you tried just:

blacklist=^\.

(For older versions of Splunk, use _blacklist instead of blacklist)
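One thing to check with an anchored pattern like this: as far as I know, the monitor input matches whitelist and blacklist regexes against the full path rather than just the file name. If that assumption holds, `^\.` would only match a path that itself begins with a dot. A small Python sketch of the difference, using a hypothetical file name:

```python
import os
import re

# Anchored pattern from the suggestion above.
ANCHORED = re.compile(r"^\.")

# Hypothetical rsync temporary file, for illustration only.
path = "/Volumes/A/b/c/cluster2/data/instance/box-1/logs/.submarine.out.a1B2c3"

# Matched against the full path: the path starts with "/", not ".".
full_path_hit = ANCHORED.search(path) is not None

# Matched against the file name alone: the name does start with ".".
basename_hit = ANCHORED.search(os.path.basename(path)) is not None

print("full path match:", full_path_hit)
print("basename match: ", basename_hit)
```

If the full-path assumption is right, a pattern that includes the preceding slash is the safer choice.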


howyagoin
Contributor

I've put in gkanapathy's regex for now, but I think something is wrong with my whitelist. Is there any potential interaction between whitelists and monitoring directories which have sub-directories (where the files are actually in the sub-directories)?

I now have:


[monitor:///Volumes/A/b/c]
crcSalt = <SOURCE>
disabled = false
followTail = 0
host = strawberry
index = submarine
whitelist = submarine\.out$
sourcetype = log4j

However, my files are actually located in:

/Volumes/A/b/c/cluster3/data/instance/box-4/logs
/Volumes/A/b/c/cluster2/data/instance/box-3/logs
/Volumes/A/b/c/cluster2/data/instance/box-1/logs

And so on. A list of about 8 or so locations, but, since they're all under "c" I just pointed Splunk at that.

According to the inputstatus Tailing Processor URL, it has found "c" and some files in "c" which did not match the whitelist, but there's no indication that it has looked at data in the rest of the path, and that data is definitely not in the index (yesterday's data is there, from before I made this whitelist change).
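To rule out the regex itself, here is a rough Python check of the whitelist pattern against the directories listed above. The file names `submarine.out` and `other.log` are assumptions for illustration, and Python's `re` is only standing in for Splunk's regex engine. The sketch says nothing about how Splunk walks sub-directories; it only shows whether the pattern matches full paths at that depth.

```python
import re

# Whitelist pattern from the monitor stanza above.
WHITELIST = re.compile(r"submarine\.out$")

# Directories listed in the post.
dirs = [
    "/Volumes/A/b/c/cluster3/data/instance/box-4/logs",
    "/Volumes/A/b/c/cluster2/data/instance/box-3/logs",
    "/Volumes/A/b/c/cluster2/data/instance/box-1/logs",
]

# Hypothetical file names, for illustration only.
candidates = [d + "/submarine.out" for d in dirs] + [dirs[0] + "/other.log"]


def is_whitelisted(path):
    """Return True if the full path matches the whitelist regex."""
    return WHITELIST.search(path) is not None


matches = [p for p in candidates if is_whitelisted(p)]
for p in candidates:
    print("index" if is_whitelisted(p) else "ignore", p)
```

The pattern matches the deep paths here, which suggests that the problem lies in how the directory tree is being traversed rather than in the regex.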
