fschange with recurse=true - unexpected results from whitelist

jbidinger
Explorer

I'm trying to monitor the xml files that define a Solaris service. These files live under /var/svc/manifest/.../*.xml.

/var/svc/manifest/application/stosreg.xml
/var/svc/manifest/application/management/wbem.xml
/var/svc/manifest/network/rpc/rstat.xml
/var/svc/manifest/network/rpc/bind.xml
/var/svc/manifest/network/rpc/wall.xml
/var/svc/manifest/platform/sun4u/oplhpd.xml
/var/svc/manifest/milestone/multi-user.xml
/var/svc/manifest/system/console-login.xml
/var/svc/manifest/system/mdmonitor.xml

I have the following defined in my inputs.conf:

[filter:whitelist:xml_files]
regex1 = \.xml$

[filter:blacklist:terminal-blacklist]
regex1 = .?

[fschange:/var/svc/manifest]
sourcetype = solaris_etc
index = fileint
filters = xml_files, terminal-blacklist
disabled = false
recurse = true
pollPeriod = 300
fullEvent = true
sendEventMaxSize = -1

I use the same whitelist regex in another fschange stanza, and there it does match the xml files. The problem is that with recurse=true it no longer appears to match. I've tried variations such as .*\/.*\.xml, and nothing seems to help.

According to this page in the docs: http://www.splunk.com/base/Documentation/4.1.4/Admin/Monitorchangestoyourfilesystem it should be working.

Any help is greatly appreciated.

  • Jon

gkanapathy
Splunk Employee

The problem is that fschange whitelists and blacklists don't work the way you (or probably anyone else) would want them to work.

If you don't recurse, everything is fine: files in the monitored directory that match the whitelist get indexed, and the others don't.

The problem when you recurse is that directories underneath get the same whitelists and blacklists applied, and any directory that gets blacklisted is skipped, i.e., files within such a directory are all blacklisted.
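
To make the recursion problem concrete, here is a small Python model of the behavior described above. It is only a sketch, not Splunk's implementation: it assumes filters are tried left to right with the first match deciding, that a path matching no filter is allowed, and that directory paths are tested without a trailing slash. `TREE`, `allowed`, `walk`, and the file `top.xml` are hypothetical names made up for the illustration.

```python
import re

# Filters in the order given in the question: (kind, compiled regex).
FILTERS = [
    ("whitelist", re.compile(r"\.xml$")),  # xml_files
    ("blacklist", re.compile(r".?")),      # terminal-blacklist
]

# Toy directory tree: dicts are directories, None marks a file.
TREE = {
    "application": {"stosreg.xml": None, "management": {"wbem.xml": None}},
    "network": {"rpc": {"rstat.xml": None, "bind.xml": None}},
    "top.xml": None,  # hypothetical file directly under the root
}

ROOT = "/var/svc/manifest"


def allowed(path, filters):
    """First matching filter decides; no match means allowed (assumption)."""
    for kind, regex in filters:
        if regex.search(path):
            return kind == "whitelist"
    return True


def walk(root, tree, filters):
    """Return files that survive filtering, skipping blacklisted directories."""
    found = []
    for name, child in sorted(tree.items()):
        path = f"{root}/{name}"
        if not allowed(path, filters):
            continue  # a blacklisted directory hides everything beneath it
        if child is None:
            found.append(path)
        else:
            found.extend(walk(path, child, filters))
    return found


print(walk(ROOT, TREE, FILTERS))
```

Under these assumptions only the hypothetical `top.xml` at the root survives. The directories `application` and `network` do not end in `.xml`, so they fall through to the `.?` blacklist and are never descended into.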

I am not sure if this will work, but you can try adding a filter:

[filter:whitelist:directories]
regex1 = \/$

and adding that to your filters list. I have a feeling that it won't work, but if it does, you're okay. If it doesn't, you're kind of out of luck unless you can come up with some regex to distinguish between files and directories (or have a list of valid subdirectories), e.g., if you assume files have a . in the name, maybe:

[filter:whitelist:directories]
regex1 = /[^/\.]+$

Note that this problem applies recursively to subdirectories as well. I concede that it is bad.
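
The two candidate regexes above can be checked against sample paths before touching inputs.conf. The sketch below uses Python's `re` module, which treats these particular patterns the same way PCRE does. It does not tell you what string fschange actually passes to the filter for a directory (with or without a trailing slash), which is the open question for the first regex. The `site.d` path and the `matches` helper are hypothetical.

```python
import re

trailing_slash = re.compile(r"\/$")     # first suggestion
no_dot_name = re.compile(r"/[^/\.]+$")  # second suggestion

samples = [
    "/var/svc/manifest/network/rpc",           # directory, no trailing slash
    "/var/svc/manifest/network/rpc/",          # same directory, trailing slash
    "/var/svc/manifest/network/rpc/bind.xml",  # file
    "/var/svc/manifest/site.d",                # hypothetical directory with a dot
]


def matches(regex, paths):
    """Return the sample paths the regex matches anywhere."""
    return [p for p in paths if regex.search(p)]


print(matches(trailing_slash, samples))
print(matches(no_dot_name, samples))
```

The first regex matches only the path written with a trailing slash, and the second matches only the slash-less directory whose last component has no dot. A directory such as the hypothetical `site.d` would still be skipped by both, which is the limitation of assuming that only files have a `.` in the name.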

This behavior is accurately noted and described in the docs: http://www.splunk.com/base/Documentation/latest/Admin/Monitorchangestoyourfilesystem#Configure_the_f...

gkanapathy
Splunk Employee

Given the way the whitelisting works, and since it appears that you're trying to index the full file, there is another way to get the result you want. You can use regular file monitoring rather than the fschange monitor to get the full file, with some settings for the source type. In inputs.conf, you would use:

[monitor:///var/svc/manifest/]
whitelist = \.xml$
sourcetype = solaris_etc
index = fileint

in props.conf:

[solaris_etc]
DATETIME_CONFIG = NONE
CHECK_METHOD = entire_md5
TRUNCATE = 0
LINE_BREAKER = (?!)

This should wind up looking the same, with the bonus that you won't have a poll period so the changes should be detected more quickly.
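
For anyone wondering why these two regexes do the job, here is a quick check in Python, whose `re` syntax agrees with PCRE for these constructs. It only exercises the patterns themselves and assumes the monitor whitelist is matched against the full file path. The XML snippet is made-up sample content.

```python
import re

never = re.compile(r"(?!)")        # LINE_BREAKER value from props.conf
whitelist = re.compile(r"\.xml$")  # whitelist value from the monitor stanza

# Hypothetical file content spanning several lines.
sample = "<service_bundle>\n  <service name='x'/>\n</service_bundle>\n"

# An empty negative lookahead fails at every position, so no break is found.
print(never.search(sample))

# The whitelist keeps .xml files and does not match a bare directory path.
print(bool(whitelist.search("/var/svc/manifest/network/rpc/bind.xml")))
print(bool(whitelist.search("/var/svc/manifest/network/rpc")))
```

Because `(?!)` can never match, no line break is ever found inside the file, which fits the goal of getting the whole file as one event. `TRUNCATE = 0` removes the length limit that would otherwise cut it short.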

gkanapathy
Splunk Employee

I have another answer that I will post for you in a bit that I think will solve your problem.

gkanapathy
Splunk Employee

If you are trying to do full events, I also have another answer for you.

jbidinger
Explorer

In my particular case, it isn't realistic to try to whitelist all of the potential subdirectories since they are not necessarily defined.

Perhaps a potential feature would be the option of a "depth-first" match, where only the files are compared against the regex.

I can see where both types of behaviors would be handy.
