Universal forwarder performance when using wildcard monitor statements over deep file systems

Explorer

Hi

I read a post saying "Using wildcard monitor statements over deep file systems has a significant performance impact, so if this can be avoided it would be of benefit."
I'd like to better understand what exactly that means. What kind of "performance impact" is it: CPU, memory, disk, I/O?

We have a UF 6.5 running on a Linux box, monitoring a folder with about 460 files. The folder has 8 levels of sub-folders, and then come the log files. Is this a DEEP file system?

When I put the wildcard at the second level of sub-folders and monitor the whole folder tree in one stanza, memory consumption becomes huge and the log server comes close to freezing.
When I specify every individual log file in its own stanza without using wildcards, everything works well without any performance issue.
The issue is that the second-level sub-folder names are dynamic, so we have to use an ad-hoc script to manually build the configuration file for all directories/files every day. We'd really like a better solution that avoids this daily manual intervention.
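
For readers wondering what such a script might look like, here is a minimal Python sketch, not the actual script used here. It assumes the log files end in `.log` and that an empty `[monitor://...]` stanza per file is acceptable; the function name and the suffix are hypothetical, and real stanzas would likely also carry settings such as index or sourcetype.

```python
import os

def build_monitor_stanzas(root, suffix=".log"):
    """Return inputs.conf text with one [monitor://...] stanza per log file.

    Assumption: log files are identified by their suffix only.
    """
    stanzas = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                # An absolute path such as /logs/a.log yields monitor:///logs/a.log
                stanzas.append(f"[monitor://{os.path.join(dirpath, name)}]\n")
    # Sort so that the generated file is stable from one run to the next.
    return "\n".join(sorted(stanzas))
```

The returned text would still have to be written into the forwarder's inputs.conf and picked up by the forwarder, which is the manual step described above.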

Which makes me wonder: when a UF monitors one big folder tree, does it process everything in one thread?

Is there any other explanation for this, and any solution?

Thanks...

Revered Legend

On the box where you're monitoring with the wildcard, run the following CLI command to see how many files/directories are being monitored.

$SPLUNK_HOME/bin/splunk list monitor
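
As a complement to `list monitor`, it may help to count how many directories and files actually sit on disk below the folder where the wildcard starts. The following is a rough Python sketch for that; the function name `count_tree` is made up for illustration and is not part of Splunk.

```python
import os

def count_tree(root):
    """Return (directories, files) found below root, excluding root itself."""
    n_dirs = n_files = 0
    # os.walk does not descend into symlinked directories by default,
    # but it does list them among dirnames.
    for _dirpath, dirnames, filenames in os.walk(root):
        n_dirs += len(dirnames)
        n_files += len(filenames)
    return n_dirs, n_files
```

Calling `count_tree` on the top-level monitored folder gives a quick sense of how much has to be walked compared with the roughly 460 files that are actually wanted.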

Monitoring a directory with 8 subdirectory levels and 460 files is not DEEP. But if there are many subdirectories within those 8 levels, and your wildcard is written in such a way that Splunk has to traverse that whole file system, then you would see high CPU usage on that box.
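
To illustrate why the position of the wildcard matters: according to the Splunk documentation on wildcard input paths, the longest wildcard-free part of the path becomes the monitored directory, and the wildcards are turned into an implicit whitelist regex (`...` corresponds to `.*`, and `*` to `[^/]*`). The Python sketch below mimics that rule. It is only an approximation, the regex Splunk really generates may look different, and the function name and sample path are hypothetical.

```python
import re

def split_monitor_path(path):
    """Split a monitor path into (base_dir, whitelist_regex).

    Returns (path, None) when the path contains no wildcard.
    """
    segments = path.split("/")
    # Index of the first path segment that contains a wildcard, if any.
    first_wild = next(
        (i for i, seg in enumerate(segments) if "*" in seg or "..." in seg),
        None,
    )
    if first_wild is None:
        return path, None
    # Everything before the first wildcard segment is what gets monitored.
    base_dir = "/".join(segments[:first_wild]) or "/"
    # Escape literal text and translate the two wildcard forms.
    parts = re.split(r"(\.\.\.|\*)", path)
    regex = "".join(
        ".*" if p == "..." else "[^/]*" if p == "*" else re.escape(p)
        for p in parts
    )
    return base_dir, regex + "$"
```

For a hypothetical path like `/logs/app/*/a/b/app.log`, the sketch returns `/logs/app` as the directory to monitor. If the forwarder behaves as documented, everything below `/logs/app` is then a candidate that has to be checked against the pattern, not just the log files you care about.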

What is the pattern/name of that dynamically named second-level subdirectory? Maybe a better-written wildcard would help here.

Ultra Champion

-- We have a UF 6.5 running on a Linux box, monitoring a folder with about 460 files. The folder has 8 levels of sub-folders, and then come the log files. Is this a DEEP file system?

Nope, it's not a deep file system - with 460 files you have nothing to worry about. You are good to go.

Explorer

Again, regarding "deep file systems":

What if I create a bunch of symbolic links to move the dynamic second-level sub-folder names to the deepest level (maybe the closest or second-closest sub-folder to the leaf/log files) and put the wildcards there?

Could that be a solution? Or would it still count as wildcards on a "deep file system"?

And what if I use symlinks to group these 460 files into a few (say 4-10) stanzas, rather than 460 stanzas or a single big one? Could that be a solution?
Sorry, it is not easy for me to try this out myself.

Contributor

Hi,

I am not sure why the wildcard is impacting performance. We use wildcards in many configurations and have never run into this scenario. Please check the docs below and see whether you can change the wildcards.

https://docs.splunk.com/Documentation/Splunk/latest/Data/Specifyinputpathswithwildcards