
Monitoring a directory using whitelisting

jeffwarn
Explorer

I have a logging share directly on the Splunk server, to which a number of web servers write a few logs. The structure looks more or less like this:

(There are a number of "web" servers, such as web-02, web-03, etc.)

/opt/var/log/httpd/web-01.domain.com/access.log
/opt/var/log/httpd/web-01.domain.com/access.log.1.gz
/opt/var/log/httpd/web-01.domain.com/access.log.2.gz
/opt/var/log/httpd/web-01.domain.com/tmp.dmp

(There are a number of "app" servers, such as app-02, app-03, etc.)

/opt/var/log/httpd/app-01.domain.com/access.log
/opt/var/log/httpd/app-01.domain.com/access.log.1.gz
/opt/var/log/httpd/app-01.domain.com/access.log.2.gz
/opt/var/log/httpd/app-01.domain.com/tmp.dmp

Note: the logs I want to index in each directory are access.log, error.log, rewrite.log, etc. Basically, anything that is a current *.log file.

What I want to do is index only the "web" servers, and only the uncompressed *.log files. The app server directories should be ignored.

I set up an input with the following:

Host - regex on path Hostname (regex): /opt/var/log/httpd/([^/]+)/
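
For illustration, here is a minimal Python sketch of what that host regex captures from one of the sample paths. The function name `host_from_path` is hypothetical, and Python's `re` module is only a stand-in for Splunk's own regex engine, so treat this as a sanity check rather than a statement of how Splunk evaluates it.

```python
import re

# The host regex from the input definition: the first capture group is the
# directory name directly under /opt/var/log/httpd/.
HOST_REGEX = re.compile(r"/opt/var/log/httpd/([^/]+)/")


def host_from_path(path):
    """Return the captured host segment, or None if the path does not match."""
    match = HOST_REGEX.search(path)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(host_from_path("/opt/var/log/httpd/web-01.domain.com/access.log"))
```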

Advanced Options, Whitelist: I've tried each of the following, as well as variations of them without escaping the /:

\/web-\d+\.domain\.com\/\w+\.log$

web-*\/*.log$

web-\d+\.*\/\w+\.log$

web-*\.log$

After I save the input and wait a minute or so, the "Number of files" on the Input summary page shows 180, when only 24 files should be indexed.

Is that number on the Input summary page accurate? Is that the actual number of files that are being indexed, even after whitelisting?

If so, what am I missing here?
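
One way to check the whitelist candidates independently of the summary page is to run them against the sample paths. The sketch below does that in Python, assuming the whitelist is applied as an unanchored regular expression against the full file path. Python's `re` is not Splunk's regex engine, so this is an approximation, and `matching_paths` is a hypothetical helper name.

```python
import re

# Sample paths taken from the directory listing in the question.
SAMPLE_PATHS = [
    "/opt/var/log/httpd/web-01.domain.com/access.log",
    "/opt/var/log/httpd/web-01.domain.com/access.log.1.gz",
    "/opt/var/log/httpd/web-01.domain.com/access.log.2.gz",
    "/opt/var/log/httpd/web-01.domain.com/tmp.dmp",
    "/opt/var/log/httpd/app-01.domain.com/access.log",
    "/opt/var/log/httpd/app-01.domain.com/access.log.1.gz",
    "/opt/var/log/httpd/app-01.domain.com/access.log.2.gz",
    "/opt/var/log/httpd/app-01.domain.com/tmp.dmp",
]

# The four whitelist patterns tried in the question, copied verbatim.
CANDIDATES = [
    r"\/web-\d+\.domain\.com\/\w+\.log$",
    r"web-*\/*.log$",
    r"web-\d+\.*\/\w+\.log$",
    r"web-*\.log$",
]


def matching_paths(pattern, paths=SAMPLE_PATHS):
    """Return the paths in which the pattern is found anywhere (unanchored)."""
    regex = re.compile(pattern)
    return [p for p in paths if regex.search(p)]


if __name__ == "__main__":
    for pattern in CANDIDATES:
        print(pattern)
        for path in matching_paths(pattern):
            print("    ", path)
```

Under those assumptions, only the first pattern selects the uncompressed web-server log. The other three treat `*` as a glob wildcard, whereas in a regex it means "zero or more of the preceding character", so they match none of the sample paths.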


amrit
Splunk Employee

Sorry - as you've noted, the traditional ways of listing monitor inputs are a bit buggy in recent versions. The REST endpoint provides a much clearer view. You can get a summarized/realtime-ish view of the endpoint via the script @ http://blogs.splunk.com/2011/01/02/did-i-miss-christmas-2/


jeffwarn
Explorer

I found another article that shed light on this. The page http://SPLUNKSERVER:8089/services/admin/inputstatus/TailingProcessor:FileStatus gives a good indication of what is being processed. I suppose the "Number of files" field on the inputs summary page does not take filtered items into account.

Using the pattern: .-web-\d+./*.log$ seemed to work just fine. It was just driving me nuts thinking I was indexing all these other files when I wasn't.
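
For anyone who would rather pull the FileStatus page from a script than a browser, here is a rough Python sketch. It assumes the management port (8089) is served over HTTPS and accepts HTTP Basic authentication; the function names, the `verify_tls` switch, and the credentials are placeholders of mine, not anything Splunk provides.

```python
import base64
import ssl
import urllib.request

# Endpoint path taken from the URL mentioned above.
ENDPOINT = "/services/admin/inputstatus/TailingProcessor:FileStatus"


def file_status_url(host, port=8089, scheme="https"):
    """Build the FileStatus URL; scheme and port are assumptions, adjust as needed."""
    return f"{scheme}://{host}:{port}{ENDPOINT}"


def fetch_file_status(host, username, password, port=8089, verify_tls=True):
    """Fetch the raw response body from the FileStatus endpoint as text."""
    request = urllib.request.Request(file_status_url(host, port))
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")

    context = ssl.create_default_context()
    if not verify_tls:
        # Only for test servers with self-signed certificates.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE

    with urllib.request.urlopen(request, context=context, timeout=30) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_file_status("SPLUNKSERVER", "admin", "changeme")` would return the raw response body, which you can then search for the paths you expect to be included or skipped.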


gkanapathy
Splunk Employee
Splunk Employee

What is the path/pattern you entered to be monitored? Did you enter each individual directory, or specify /opt/var/log/httpd/web-*/*.log, or something else?
