<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Monitor a directory and using whitelisting in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Monitor-a-directory-and-using-whitelisting/m-p/58108#M14193</link>
    <description></description>
    <pubDate>Sat, 12 Mar 2011 05:25:31 GMT</pubDate>
    <dc:creator>jeffwarn</dc:creator>
    <dc:date>2011-03-12T05:25:31Z</dc:date>
    <item>
      <title>Monitor a directory and using whitelisting</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Monitor-a-directory-and-using-whitelisting/m-p/58108#M14193</link>
      <description>&lt;P&gt;I have a logging share on the Splunk server itself, to which a number of web servers write a few logs. The structure looks more or less like this:&lt;/P&gt;

&lt;P&gt;(There are a number of "web" servers such as web-02, -03 ..etc)&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;/opt/var/log/httpd/web-01.domain.com/access.log
/opt/var/log/httpd/web-01.domain.com/access.log.1.gz
/opt/var/log/httpd/web-01.domain.com/access.log.2.gz
/opt/var/log/httpd/web-01.domain.com/tmp.dmp
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;(There are a number of "app" servers such as app-02 , -03, ..etc)&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;/opt/var/log/httpd/app-01.domain.com/access.log
/opt/var/log/httpd/app-01.domain.com/access.log.1.gz
/opt/var/log/httpd/app-01.domain.com/access.log.2.gz
/opt/var/log/httpd/app-01.domain.com/tmp.dmp
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Note: the logs I want to index in each directory are access.log, error.log, rewrite.log, etc. Basically, anything that is a current *.log file.&lt;/P&gt;

&lt;P&gt;I want to index only the "web" servers, and only their uncompressed *.log files. The app server directories should be ignored.&lt;/P&gt;

&lt;P&gt;I set up an input with the following:&lt;/P&gt;

&lt;P&gt;Host - regex on path
Hostname (regex): /opt/var/log/httpd/([^/]+)/&lt;/P&gt;

&lt;P&gt;Advanced Options
Whitelist: I've tried the following (and variations of each without escaping the /):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;\/web-\d+\.domain\.com\/\w+\.log$

web-*\/*.log$

web-\d+\.*\/\w+\.log$

web-*\.log$
&lt;/CODE&gt;&lt;/PRE&gt;
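
&lt;P&gt;For reference, the UI settings above correspond roughly to an inputs.conf stanza like the one below. This is only a sketch: it assumes the monitored path is /opt/var/log/httpd (the post does not say what path was entered) and uses the first whitelist attempt with the slashes left unescaped.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# inputs.conf (sketch; the monitored path is an assumption)
[monitor:///opt/var/log/httpd]
# Monitor only files whose path matches this regex:
# current *.log files in the web-NN.domain.com directories.
whitelist = /web-\d+\.domain\.com/\w+\.log$
# The first capture group of this regex is used as the host.
host_regex = /opt/var/log/httpd/([^/]+)/
&lt;/CODE&gt;&lt;/PRE&gt;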

&lt;P&gt;After I save the input and wait a minute or so, the "Number of files" on the Input summary page shows 180, when it should only be indexing 24 files.&lt;/P&gt;

&lt;P&gt;Is that number on the Input summary page accurate? Is that the actual number of files that are being indexed, even after whitelisting?&lt;/P&gt;

&lt;P&gt;If so, what am I missing here?&lt;/P&gt;</description>
      <pubDate>Sat, 12 Mar 2011 05:25:31 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Monitor-a-directory-and-using-whitelisting/m-p/58108#M14193</guid>
      <dc:creator>jeffwarn</dc:creator>
      <dc:date>2011-03-12T05:25:31Z</dc:date>
    </item>
    <item>
      <title>Re: Monitor a directory and using whitelisting</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Monitor-a-directory-and-using-whitelisting/m-p/58109#M14194</link>
      <description>&lt;P&gt;What is the path/pattern you entered to be monitored? Did you enter each individual directory, specify &lt;CODE&gt;/opt/var/log/httpd/web-*/*.log&lt;/CODE&gt;, or something else?&lt;/P&gt;</description>
      <pubDate>Sat, 12 Mar 2011 10:19:05 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Monitor-a-directory-and-using-whitelisting/m-p/58109#M14194</guid>
      <dc:creator>gkanapathy</dc:creator>
      <dc:date>2011-03-12T10:19:05Z</dc:date>
    </item>
    <item>
      <title>Re: Monitor a directory and using whitelisting</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Monitor-a-directory-and-using-whitelisting/m-p/58110#M14195</link>
      <description>&lt;P&gt;I found another article that pointed me to this page: &lt;A href="http://SPLUNKSERVER:8089/services/admin/inputstatus/TailingProcessor:FileStatus"&gt;http://SPLUNKSERVER:8089/services/admin/inputstatus/TailingProcessor:FileStatus&lt;/A&gt;, which gives a good indication of what is actually being processed. I suppose the "Number of files" field on the Input summary page does not take any filtered items into account.&lt;/P&gt;

&lt;P&gt;Using the pattern &lt;CODE&gt;.*-web-\d+.*/*.log$&lt;/CODE&gt; seemed to work just fine. It was just driving me nuts thinking I was indexing all these other files when I wasn't.&lt;/P&gt;</description>
      <pubDate>Mon, 14 Mar 2011 20:25:38 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Monitor-a-directory-and-using-whitelisting/m-p/58110#M14195</guid>
      <dc:creator>jeffwarn</dc:creator>
      <dc:date>2011-03-14T20:25:38Z</dc:date>
    </item>
    <item>
      <title>Re: Monitor a directory and using whitelisting</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Monitor-a-directory-and-using-whitelisting/m-p/58111#M14196</link>
      <description>&lt;P&gt;Sorry - as you've noted, the traditional ways of listing monitor inputs are a bit buggy in recent versions. The REST endpoint provides a much clearer view. You can get a summarized/realtime-ish view of the endpoint via the script at &lt;A href="http://blogs.splunk.com/2011/01/02/did-i-miss-christmas-2/"&gt;http://blogs.splunk.com/2011/01/02/did-i-miss-christmas-2/&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 22 Apr 2011 17:38:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Monitor-a-directory-and-using-whitelisting/m-p/58111#M14196</guid>
      <dc:creator>amrit</dc:creator>
      <dc:date>2011-04-22T17:38:00Z</dc:date>
    </item>
  </channel>
</rss>

