Struggling with inputs.conf and conflicting rules

Builder

Hi,

I'm struggling with an issue involving my old nemesis, inputs.conf rules :-). In this case, we have a catch-all rule on our apache servers' inputs.conf at the bottom that looks like

[monitor:///var/weblogs/]
crcSalt = <SOURCE>
whitelist = .../(access|error)\.log$
index = apache

which works fine. There are multiple directories for different applications (Apache virtual hosts) under /var/weblogs like /var/weblogs/foo/access.log and so on.

What I need to do is tell Splunk to make an exception for applications that start with "presell". As a normal regular expression this would look like "^/var/weblogs/presell.*/(access|error)\.log$".
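For reference, the intended match can be sanity-checked outside Splunk. The snippet below is a minimal sketch using Python's re module (not Splunk's PCRE engine, although the two agree for a pattern this simple). The dot before "log" is escaped here, and the sample paths are hypothetical; only the directory layout comes from the post.

```python
import re

# The expression from the post, with the literal dot before "log" escaped.
PRESELL_RE = re.compile(r"^/var/weblogs/presell.*/(access|error)\.log$")

# Hypothetical sample paths used only for illustration.
samples = [
    "/var/weblogs/presellAppA/access.log",
    "/var/weblogs/presellAppB/error.log",
    "/var/weblogs/foo/access.log",
    "/var/weblogs/presellAppA/debug.log",
]

def is_presell_log(path):
    """Return True if the path is an access/error log of a presell* app."""
    return PRESELL_RE.search(path) is not None

for path in samples:
    print(path, is_presell_log(path))
```

Note that "presell.*" can span slashes, so a log in a deeper subdirectory of a presell application would also match.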

I believe it used to be that monitor: lines in inputs.conf understood only their own special wildcards and no regular expression syntax, but now there seem to be rules that allow some REs to be understood if they are mixed with wildcards in the same segment (weird!). In theory I would then expect

[monitor:///var/weblogs/presell*/*(access|error).log]
crcSalt = <SOURCE>
index = presell

to work. It does not. Events are still captured but go to the default index (i.e. "apache"). That extra '*' in front of the grouping expression is in accordance with the new rules that say that it would be recognized as an RE since there is a wildcard in the same segment. I would have also expected

[monitor:///var/weblogs/presell*/access.log]
crcSalt = <SOURCE>
index = presell

to work. It does not. Also thought

[monitor:///var/weblogs/presell*/*.log]
crcSalt = <SOURCE>
index = presell

would give me mostly what I want (I don't really want to capture anything that might crop up there with a .log suffix...). Still sends these events to the 'apache' index.

I know I can't say

[monitor:///var/weblogs/presell*/]
crcSalt = <SOURCE>
index = presell
whitelist = .../(access|error)\.log$

because the implicit whitelist from the monitor: line conflicts with the explicit whitelist (and it doesn't work anyway).
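As background on the implicit whitelist: Splunk's documentation describes a wildcarded monitor path as being split into the longest wildcard-free base directory plus a generated whitelist, with "..." translated to ".*" and "*" to "[^/]*". The sketch below is a rough Python approximation of that translation, not Splunk's actual code, and monitor_path_to_whitelist is a hypothetical helper. If the approximation holds, every wildcard stanza tried above reduces to the same base directory as the catch-all, /var/weblogs/, which may be why they collide.

```python
import re

def monitor_path_to_whitelist(path):
    """Approximate Splunk's split of a wildcarded monitor path into
    (base_dir, implicit_whitelist_regex). Hypothetical helper."""
    segments = path.split("/")
    # Index of the first segment containing a Splunk wildcard ("..." or "*").
    first_wild = next(
        (i for i, seg in enumerate(segments) if "..." in seg or "*" in seg),
        None,
    )
    if first_wild is None:
        return path, None  # no wildcard, so no implicit whitelist
    base_dir = "/".join(segments[:first_wild]) + "/"
    # Translate in an order that keeps the pieces from clobbering each other:
    # park "..." in a placeholder, escape literal dots, expand "*", then
    # expand the placeholder.
    placeholder = "\0"
    regex = path.replace("...", placeholder)
    regex = regex.replace(".", r"\.")
    regex = regex.replace("*", "[^/]*")
    regex = regex.replace(placeholder, ".*")
    return base_dir, "^" + regex + "$"

for stanza_path in [
    "/var/weblogs/",
    "/var/weblogs/presell*/*.log",
    "/var/weblogs/presell*/*(access|error).log",
]:
    print(stanza_path, "->", monitor_path_to_whitelist(stanza_path))
```

Under this reading, the generated pattern for the last path would also accept a file such as ssl_access.log, because the leading "*" allows any prefix within the final segment.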

The only things I've found that work are to explicitly list the directories and/or files. That is, either

[monitor:///var/weblogs/presellAppA/access.log]
crcSalt = <SOURCE>
index = presell

[monitor:///var/weblogs/presellAppA/error.log]
crcSalt = <SOURCE>
index = presell

[monitor:///var/weblogs/presellAppB/access.log]
crcSalt = <SOURCE>
index = presell

[monitor:///var/weblogs/presellAppB/error.log]
crcSalt = <SOURCE>
index = presell

OR

[monitor:///var/weblogs/presellAppA/]
crcSalt = <SOURCE>
index = presell
whitelist = .../(access|error)\.log$

[monitor:///var/weblogs/presellAppB/]
crcSalt = <SOURCE>
index = presell
whitelist = .../(access|error)\.log$

both of which are undesirable because I still have to enumerate every application that starts with "presell". If a new one cropped up tomorrow, it would not be handled the way I want, as it would be if I could truly match on "presell*".

Normally if I stare at stuff like this long enough I see something obvious that I'm doing wrong, but so far I've been unable to figure out why this isn't working the way I'd expect.

I had been using a 6.4.3 universal forwarder here (sending to 6.5.0 forwarders and indexers) and then moved to 6.5.0 universal forwarders. As expected, the different versions work the same.

Any idea what I'm doing wrong here?

Thanks

Mark

1 Solution

Legend

Did you try using blacklists in your inputs.conf after the whitelists?
That way you can exclude "presell" from the first stanza and create another stanza dedicated to these logs.

Otherwise, you could override the index at index time using a regex:

# etc/system/local/props.conf
[mysourcetype]
TRANSFORMS-index = overrideindex

# etc/system/local/transforms.conf
[overrideindex]
DEST_KEY = _MetaData:Index
REGEX = your_regex
FORMAT = my_new_index
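The answer leaves the regex unspecified. One possible choice for this thread's case is sketched below in Python so it can be tried against sample values. It assumes the transform is pointed at the file path rather than the raw event (for example with SOURCE_KEY = MetaData:Source in the transform stanza, which the answer does not show), and it is left unanchored at the start so it matches whether or not that value carries a "source::" prefix. Index-time transforms like this are applied on an indexer or heavy forwarder, not on a universal forwarder.

```python
import re

# Candidate value for REGEX in the [overrideindex] stanza. This is an
# assumption for illustration, not something the answer specifies.
CANDIDATE_REGEX = re.compile(r"/var/weblogs/presell[^/]*/(access|error)\.log$")

def would_override(source_value):
    """Return True if the transform's regex would fire for this source value."""
    return CANDIDATE_REGEX.search(source_value) is not None

# Hypothetical source values, with and without the "source::" prefix.
for value in [
    "source::/var/weblogs/presellAppA/access.log",
    "/var/weblogs/presellAppB/error.log",
    "source::/var/weblogs/foo/access.log",
]:
    print(value, would_override(value))
```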

Bye.
Giuseppe


Builder

I'm embarrassed to admit that I hadn't considered blacklists to solve this. It was still quite a pain to resolve even with the blacklist, but it did help a lot. (Why can't Splunk just use regular expressions in the monitor line now?) I had trouble constructing a monitor line that would properly match files in any presell* directory (trouble I don't think I should have had), but I finally landed on the following, which seems to work.

[monitor:///var/weblogs/presell*/*(access|error).log]
crcSalt = <SOURCE>
index = presell

[monitor:///home/splunk/weblogs/]
crcSalt = <SOURCE>
whitelist = /(access|error)\.log$
blacklist = /presell
index = apache
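The whitelist and blacklist in the catch-all stanza are unanchored regular expressions, so they can be exercised against sample paths. The sketch below uses Python's re module and assumes that a blacklist match excludes a file even when the whitelist also matches; catch_all_accepts is a hypothetical helper and the sample paths are made up, using the /home/splunk/weblogs/ base as written in the stanza.

```python
import re

# Patterns copied from the catch-all stanza above.
WHITELIST = re.compile(r"/(access|error)\.log$")
BLACKLIST = re.compile(r"/presell")

def catch_all_accepts(path):
    """True if the path passes the whitelist and is not blacklisted."""
    if BLACKLIST.search(path):
        return False  # assumed: the blacklist wins over the whitelist
    return WHITELIST.search(path) is not None

# Hypothetical paths for illustration.
for path in [
    "/home/splunk/weblogs/foo/access.log",
    "/home/splunk/weblogs/presellAppA/access.log",
    "/home/splunk/weblogs/foo/presell-archive/access.log",
    "/home/splunk/weblogs/foo/debug.log",
]:
    print(path, catch_all_accepts(path))
```

Because "/presell" is unanchored, it excludes any path with a component starting with "presell" at any depth, not only top-level presell* applications. If that matters, a pattern anchored to the base directory would be a tighter alternative.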

The docs seem to say that

[monitor:///var/weblogs/presell*/(access|error)*.log]

should get Splunk to recognize both its own wildcard and a PCRE in the final segment. That never worked for me, whereas

[monitor:///var/weblogs/presell*/*(access|error).log]

does. The '*' in the final segment is completely superfluous other than to tell Splunk to honor the PCRE.

Thanks!