Whitelisting and blacklisting input files

Communicator

Hello, I have a directory, say "foo", containing several logs. For example, I have three kinds of logs with names like these (the last two digits change every hour):

aaaa.log.2010-12-15-00.gz aaaa.log.2010-12-16-15 aaaa.log.2010-12-16-16

bbbb.log.2010-12-15-00.gz bbbb.log.2010-12-16-15 bbbb.log.2010-12-16-16

cccc.log.2010-12-15-00.gz cccc.log.2010-12-16-15 cccc.log.2010-12-16-16

What I want to achieve is to index every aaaa, bbbb and cccc file, assigning a different sourcetype to each kind, and to skip the .gz files (which are rotated files).

Is that possible? Thanks in advance,

Luca Caldiero Consoft Sistemi S.p.A.

Builder

Splunk uses a very easy whitelist/blacklist setup for the inputs.conf file.

Edit the inputs.conf for the app you're working in ($SPLUNK_HOME/etc/apps/search/local/inputs.conf would be the path for the default search app).

Normally you would have a monitor stanza that looks something like this:

[monitor:///blahdirector/blahdirectory2/]
disabled = false
index = main

There are two ways to set the sourcetype: either use inputs.conf and force the sourcetype with three separate inputs, one each for the aaaa, bbbb and cccc files, or use props.conf to assign it based on a pattern that matches the source.

Example for inputs.conf:

[monitor:///blahdirector/blahdirectory2/aaaa.log.*]
disabled = false
index = main
sourcetype = blahaaaa

If you have several different kinds of files in one directory, you're better off assigning the sourcetypes in props.conf. This way, Splunk only has to monitor the one directory and only needs one blacklist/whitelist to know what to index, and the sourcetypes are set "after" the data is collected.

To set the sourcetype using props.conf:

[source::/blahdirector/blahdirectory2/aaaa.log.*]
sourcetype = blahaaaa
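
Extending that to all three kinds of logs might look like the sketch below. The sourcetype names blahbbbb and blahcccc are placeholders I made up to follow the same pattern, so use whatever names fit your data.

```ini
# props.conf (sketch): one stanza per kind of log in the monitored directory
[source::/blahdirector/blahdirectory2/aaaa.log.*]
sourcetype = blahaaaa

# blahbbbb and blahcccc are hypothetical sourcetype names
[source::/blahdirector/blahdirectory2/bbbb.log.*]
sourcetype = blahbbbb

[source::/blahdirector/blahdirectory2/cccc.log.*]
sourcetype = blahcccc
```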

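Now, to ignore the archived files, you simply add a blacklist to the same inputs.conf. (Newer Splunk versions document this setting as blacklist, without the leading underscore; _blacklist is the older spelling, so check the inputs.conf spec for your version.)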

Example for a full directory:

[monitor:///blahdirector/blahdirectory2/]
disabled = false
index = main
_blacklist = .*\.gz

That input would now ignore all .gz files in that directory. You can also exclude multiple extensions at once, anchoring the pattern to the end of the file name with $:

[monitor:///blahdirector/blahdirectory2/]
disabled = false
index = main
_blacklist = \.(gz|zip|bkz|arch|etc)$
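
For the file names in the question, a whitelist could do the same job from the other direction by accepting only the uncompressed hourly files. This is only a sketch: it assumes _whitelist is available as the counterpart of _blacklist in your Splunk version, and that the names always end in a YYYY-MM-DD-HH stamp as in the examples above.

```ini
[monitor:///blahdirector/blahdirectory2/]
disabled = false
index = main
# Accept only aaaa/bbbb/cccc logs whose path ends in the hourly stamp.
# Names with a trailing .gz do not match because of the $ anchor.
_whitelist = (aaaa|bbbb|cccc)\.log\.\d{4}-\d{2}-\d{2}-\d{2}$
```

With a whitelist like this, any other file that lands in the directory is skipped too, which may or may not be what you want.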

Hope this helps!

Communicator

Yes, thanks a lot, I'll try your suggestions asap.
Luca.
