<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to extract sourcetype and index with a regex from the monitor path directory structure? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-sourcetype-and-index-with-a-regex-from-the/m-p/447495#M126848</link>
    <description>&lt;P&gt;Like this:&lt;/P&gt;

&lt;P&gt;In inputs.conf:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;#apache or nginx
[monitor:///apps/.../.../.../*.log*]
sourcetype = apache_or_nginx_temp
index = apache_or_nginx_temp
blacklist = \.(zip|gz)$
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;In props.conf:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[apache_or_nginx_temp]
TRANSFORMS-overrides_from_path = index_from_path, sourcetype_from_path
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;In transforms.conf:&lt;/P&gt;

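&lt;PRE&gt;&lt;CODE&gt;[index_from_path]
# Index-time transforms read the source from the MetaData:Source key (value looks like source::/path/to/file)
SOURCE_KEY = MetaData:Source
# Skip the first path segment (/apps), capture the second (apache or nginx)
REGEX = (?:/[^/]+){1}/([^/]+)/
DEST_KEY = _MetaData:Index
FORMAT = $1

[sourcetype_from_path]
SOURCE_KEY = MetaData:Source
# Skip the first two path segments, capture the third and fourth (e.g. http and access)
REGEX = (?:/[^/]+){2}/([^/]+)/([^/]+)/
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::$1:$2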
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;You must deploy this to the first full Splunk instance(s) that handle the events (usually the HF tier, if you have one, otherwise the indexer tier), restart all Splunk instances there, and send in new events (events that are already indexed will stay broken). Then test with _index_earliest=-5m to be certain that you are examining only the newly indexed events.&lt;/P&gt;</description>
    <pubDate>Wed, 30 Sep 2020 01:02:57 GMT</pubDate>
    <dc:creator>woodcock</dc:creator>
    <dc:date>2020-09-30T01:02:57Z</dc:date>
    <item>
      <title>How to extract sourcetype and index with a regex from the monitor path directory structure?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-sourcetype-and-index-with-a-regex-from-the/m-p/447493#M126846</link>
      <description>&lt;P&gt;My end goal is to extract the sourcetype and index with a regex from the monitor path at runtime based on a lookup from the directory structure.&lt;/P&gt;

&lt;P&gt;For example, in the case of apache,&lt;BR /&gt;
the actual monitor path will look like:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;/apps/apache/http/access/http-access.log
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;OR&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;/apps/nginx/http/error/http-error.log
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;inputs.conf&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt; #apache or nginx
  [monitor:///apps/.../.../.../*.log.*]
  sourcetype = ( REGEX = ^source::(?:/[^/]+){1}/([^/]+)/ ): ( REGEX = ^source::(?:/[^/]+){2}/([^/]+)/ )
  index =  (REGEX = ^source::(?:/[^/]+){0}/([^/]+)/ )
  blacklist = \.(zip|gz)$
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;&lt;STRONG&gt;Desired output:&lt;/STRONG&gt;&lt;/P&gt;

&lt;P&gt;Splunk sends all apache access logs from &lt;CODE&gt;/apps/apache/http/access/http-access.log&lt;/CODE&gt; with &lt;CODE&gt;index=apache&lt;/CODE&gt; and &lt;CODE&gt;sourcetype=http:access&lt;/CODE&gt;,&lt;BR /&gt;
and Splunk also sends all nginx error logs from &lt;CODE&gt;/apps/nginx/http/error/http-error.log&lt;/CODE&gt; with &lt;CODE&gt;index=nginx&lt;/CODE&gt; and &lt;CODE&gt;sourcetype=http:error&lt;/CODE&gt;.&lt;/P&gt;</description>
      <pubDate>Mon, 24 Jun 2019 16:45:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-sourcetype-and-index-with-a-regex-from-the/m-p/447493#M126846</guid>
      <dc:creator>psyched4splunk</dc:creator>
      <dc:date>2019-06-24T16:45:55Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract sourcetype and index with a regex from the monitor path directory structure?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-sourcetype-and-index-with-a-regex-from-the/m-p/447494#M126847</link>
      <description>&lt;P&gt;I'm not sure if you're able to do that, but even so I think an easier and cleaner solution would be to have two separate stanzas. That way you can troubleshoot and manage each different input/index/sourcetype individually.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[monitor:///apps/.../.../.../http-access.log]
sourcetype = http:access
index = apache

[monitor:///apps/.../.../.../http-error.log]
sourcetype = http:error
index = nginx
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 24 Jun 2019 17:22:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-sourcetype-and-index-with-a-regex-from-the/m-p/447494#M126847</guid>
      <dc:creator>oscar84x</dc:creator>
      <dc:date>2019-06-24T17:22:47Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract sourcetype and index with a regex from the monitor path directory structure?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-sourcetype-and-index-with-a-regex-from-the/m-p/447495#M126848</link>
      <description>&lt;P&gt;Like this:&lt;/P&gt;

&lt;P&gt;In inputs.conf:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;#apache or nginx
[monitor:///apps/.../.../.../*.log*]
sourcetype = apache_or_nginx_temp
index = apache_or_nginx_temp
blacklist = \.(zip|gz)$
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;In props.conf:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[apache_or_nginx_temp]
TRANSFORMS-overrides_from_path = index_from_path, sourcetype_from_path
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;In transforms.conf:&lt;/P&gt;

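&lt;PRE&gt;&lt;CODE&gt;[index_from_path]
# Index-time transforms read the source from the MetaData:Source key (value looks like source::/path/to/file)
SOURCE_KEY = MetaData:Source
# Skip the first path segment (/apps), capture the second (apache or nginx)
REGEX = (?:/[^/]+){1}/([^/]+)/
DEST_KEY = _MetaData:Index
FORMAT = $1

[sourcetype_from_path]
SOURCE_KEY = MetaData:Source
# Skip the first two path segments, capture the third and fourth (e.g. http and access)
REGEX = (?:/[^/]+){2}/([^/]+)/([^/]+)/
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::$1:$2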
&lt;/CODE&gt;&lt;/PRE&gt;
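
&lt;P&gt;As a sanity check, this is how the two regexes should resolve against the sample paths from the question. It is a hand trace written as comments, not Splunk output. The regexes are unanchored, so each match starts at the first "/" in the source value. Note that the target indexes (apache and nginx here) are assumed to already exist on the indexers, along with the temporary one from inputs.conf.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# source = /apps/apache/http/access/http-access.log
#   index_from_path:      skips "/apps", captures "apache"
#                         index = apache
#   sourcetype_from_path: skips "/apps/apache", captures "http" and "access"
#                         sourcetype = http:access
#
# source = /apps/nginx/http/error/http-error.log
#   index_from_path:      index = nginx
#   sourcetype_from_path: sourcetype = http:error
&lt;/CODE&gt;&lt;/PRE&gt;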

&lt;P&gt;You must deploy this to the first full Splunk instance(s) that handle the events (usually the HF tier, if you have one, otherwise the indexer tier), restart all Splunk instances there, and send in new events (events that are already indexed will stay broken). Then test with _index_earliest=-5m to be certain that you are examining only the newly indexed events.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 01:02:57 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-sourcetype-and-index-with-a-regex-from-the/m-p/447495#M126848</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2020-09-30T01:02:57Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract sourcetype and index with a regex from the monitor path directory structure?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-sourcetype-and-index-with-a-regex-from-the/m-p/447496#M126849</link>
      <description>&lt;P&gt;I answered the question that you asked but I 100% agree with @oscar84x: DO NOT USE MY ANSWER, USE HIS!!!!!&lt;/P&gt;</description>
      <pubDate>Mon, 24 Jun 2019 17:29:23 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-sourcetype-and-index-with-a-regex-from-the/m-p/447496#M126849</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2019-06-24T17:29:23Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract sourcetype and index with a regex from the monitor path directory structure?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-sourcetype-and-index-with-a-regex-from-the/m-p/447497#M126850</link>
      <description>&lt;P&gt;Yes, in this case it makes sense to separate the stanzas. However, I didn't in my example because I wanted to see if there was a way to avoid hard-coding it as you did here.&lt;/P&gt;</description>
      <pubDate>Mon, 24 Jun 2019 18:39:26 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-sourcetype-and-index-with-a-regex-from-the/m-p/447497#M126850</guid>
      <dc:creator>psyched4splunk</dc:creator>
      <dc:date>2019-06-24T18:39:26Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract sourcetype and index with a regex from the monitor path directory structure?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-sourcetype-and-index-with-a-regex-from-the/m-p/447498#M126851</link>
      <description>&lt;P&gt;Yes, it is possible, but not advisable; see my answer.&lt;/P&gt;</description>
      <pubDate>Mon, 24 Jun 2019 18:52:43 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-sourcetype-and-index-with-a-regex-from-the/m-p/447498#M126851</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2019-06-24T18:52:43Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract sourcetype and index with a regex from the monitor path directory structure?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-sourcetype-and-index-with-a-regex-from-the/m-p/447499#M126852</link>
      <description>&lt;P&gt;Yes, I found your solution for using an HF useful and accepted it. However, this is a separate question: can I just set sourcetype and index to a regex in inputs.conf, like below?&lt;BR /&gt;
&lt;STRONG&gt;I don't see much documentation on how to do this, but I guess I don't understand why it is unattainable.&lt;/STRONG&gt; &lt;/P&gt;

&lt;BLOCKQUOTE&gt;
&lt;P&gt;inputs.conf&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;PRE&gt;&lt;CODE&gt;  #apache or nginx
   [monitor:///apps/.../.../.../*.log.*]
   sourcetype = ( REGEX = ^source::(?:/[^/]+){1}/([^/]+)/ ): ( REGEX = ^source::(?:/[^/]+){2}/([^/]+)/ )
   index =  (REGEX = ^source::(?:/[^/]+){0}/([^/]+)/ )
   blacklist = \.(zip|gz)$
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Your solution here is still referencing a HF and that's not what I'm inquiring about.&lt;/P&gt;</description>
      <pubDate>Mon, 24 Jun 2019 23:09:37 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-sourcetype-and-index-with-a-regex-from-the/m-p/447499#M126852</guid>
      <dc:creator>psyched4splunk</dc:creator>
      <dc:date>2019-06-24T23:09:37Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract sourcetype and index with a regex from the monitor path directory structure?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-sourcetype-and-index-with-a-regex-from-the/m-p/447500#M126853</link>
      <description>&lt;P&gt;"I'm not sure if you're able to do that"...&lt;/P&gt;

&lt;P&gt;Yeah I don't see much documentation on how to do this but I guess I don't understand why this is unattainable?&lt;/P&gt;</description>
      <pubDate>Mon, 24 Jun 2019 23:11:49 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-sourcetype-and-index-with-a-regex-from-the/m-p/447500#M126853</guid>
      <dc:creator>psyched4splunk</dc:creator>
      <dc:date>2019-06-24T23:11:49Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract sourcetype and index with a regex from the monitor path directory structure?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-sourcetype-and-index-with-a-regex-from-the/m-p/447501#M126854</link>
      <description>&lt;P&gt;@woodcock's answer is technically the correct way to do what you're asking for, regardless of whether it's on an HF or an indexer.  If the UFs are sending to an HF first, then the props.conf &amp;amp; transforms.conf settings have to be on the HF.  If the UFs are sending to the indexers directly, you have to put the props.conf &amp;amp; transforms.conf settings on the indexers.&lt;/P&gt;

&lt;P&gt;If you want to know if you can do this directly from inputs.conf with regex, the answer is NO.&lt;/P&gt;
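
&lt;P&gt;To illustrate the limitation: inputs.conf can derive host from the monitored path (host_segment, host_regex), but it has no equivalent for sourcetype or index, which are read as literal strings. A rough sketch, assuming the directory layout from the question; the host_segment line is only there for contrast and is not part of anyone's proposed config:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[monitor:///apps/apache/.../*.log]
# host CAN come from the path: segment 2 of /apps/apache/... is "apache"
host_segment = 2
# sourcetype and index are literal values; a regex written here is not evaluated
sourcetype = http:access
index = apache
&lt;/CODE&gt;&lt;/PRE&gt;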

&lt;P&gt;I agree with @woodcock &amp;amp; @oscar84x , don't do it this way, just make stanzas in inputs.conf to correspond to your apache &amp;amp; nginx log paths.&lt;/P&gt;

&lt;P&gt;I would use slightly different versions from @oscar84x :&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[monitor:///apps/apache/*/*/http-access.log]
sourcetype = http:access
index = apache

[monitor:///apps/nginx/*/*/http-error.log]
sourcetype = http:error
index = nginx
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 25 Jun 2019 23:00:27 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-sourcetype-and-index-with-a-regex-from-the/m-p/447501#M126854</guid>
      <dc:creator>jnudell_2</dc:creator>
      <dc:date>2019-06-25T23:00:27Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract sourcetype and index with a regex from the monitor path directory structure?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-sourcetype-and-index-with-a-regex-from-the/m-p/447502#M126855</link>
      <description>&lt;P&gt;NO; there is no capability in Splunk for this.  See the dox:&lt;BR /&gt;
&lt;A href="https://docs.splunk.com/Documentation/Splunk/latest/Admin/Inputsconf"&gt;https://docs.splunk.com/Documentation/Splunk/latest/Admin/Inputsconf&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 29 Jun 2019 16:29:51 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-sourcetype-and-index-with-a-regex-from-the/m-p/447502#M126855</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2019-06-29T16:29:51Z</dc:date>
    </item>
  </channel>
</rss>

