<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Setting sourcetype with a complex regex - transforms.conf in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Setting-sourcetype-with-a-complex-regex-transforms-conf/m-p/13401#M1237</link>
    <description>&lt;P&gt;Did some quick testing and your regex seems good.  Please post the corresponding &lt;CODE&gt;props.conf&lt;/CODE&gt; entries.  Keep in mind that Splunk doesn't do recursive sourcetype matching.  For example, say your events come in with a &lt;CODE&gt;sourcetype::temp&lt;/CODE&gt;, and then you use a transform to reassign the sourcetype to &lt;CODE&gt;sourcetype::my_st&lt;/CODE&gt;.  After re-assigning the sourcetype, Splunk will NOT look up the &lt;CODE&gt;[my_st]&lt;/CODE&gt; stanza for additional sourcetype-specific processing rules.  In other words, an inherent limitation of re-assigning sourcetypes like this is that all events are processed based on their initial sourcetype.&lt;/P&gt;</description>
    <pubDate>Fri, 28 May 2010 01:47:06 GMT</pubDate>
    <dc:creator>Lowell</dc:creator>
    <dc:date>2010-05-28T01:47:06Z</dc:date>
    <item>
      <title>Setting sourcetype with a complex regex - transforms.conf</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Setting-sourcetype-with-a-complex-regex-transforms-conf/m-p/13398#M1234</link>
      <description>&lt;P&gt;I am using transforms.conf to pull the sourcetype from the source via a complex regex. It doesn't seem to be working, so I'm wondering if you are allowed to set sourcetype with multiple concatenated capture groups.&lt;/P&gt;

&lt;P&gt;The regex checks the source for many items in a big OR statement, so only one or two capture groups should ever return a value.  So, does something like $2$3$4$5$6 work?&lt;/P&gt;

&lt;P&gt;Or is the problem that I use a backreference in the regex?&lt;/P&gt;
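
&lt;P&gt;One way to separate regex problems from configuration problems is to run the pattern outside Splunk. The sketch below is only an approximation: it uses Python's &lt;CODE&gt;re&lt;/CODE&gt; module rather than Splunk's PCRE engine, the helper &lt;CODE&gt;emulate_format&lt;/CODE&gt; is hypothetical, and it assumes that a capture group which did not take part in the match contributes an empty string to &lt;CODE&gt;$2$3$4$5$6&lt;/CODE&gt;. The &lt;CODE&gt;/logs/&lt;/CODE&gt; directory and the &lt;CODE&gt;web1-web-access&lt;/CODE&gt; file name are made up for illustration; the other names follow the comments in the stanza below.&lt;/P&gt;

```python
import re

# REGEX copied from the transforms.conf stanza in this post.
PATTERN = re.compile(
    r".*_(?:(\D+)\d?-(\1.*?)\.\d\d+|(\D+)\d(-.*?)\.\d\d+|.*_\d+_(.*)_|(.*?)\.\d\d+)"
)


def emulate_format(source):
    """Emulate FORMAT = sourcetype::$2$3$4$5$6 (hypothetical helper).

    Assumption: a group that did not take part in the match is treated
    as an empty string when the groups are concatenated.
    """
    match = PATTERN.search(source)
    if match is None:
        return None
    return "".join(match.group(i) or "" for i in range(2, 7))


# Sample sources; the /logs/ directory and the web1 name are made up.
SAMPLES = [
    "/logs/HOST_app1_20100510000003_SOURCETYPE_1.log.1.gz",
    "/logs/HOST_SOURCE1-TYPE.201005100001.log.1.gz",
    "/logs/HOST_web1-web-access.201005100001.log.1.gz",
    "/logs/HOST_SOURCETYPE.201005102301.log.1.gz",
]

for source in SAMPLES:
    print(source, "gives", emulate_format(source))
```

&lt;P&gt;If the printed values look right, the regex and its backreference are probably not the cause, and the stanza keys and the &lt;CODE&gt;props.conf&lt;/CODE&gt; wiring are the next things to check.&lt;/P&gt;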

&lt;PRE&gt;&lt;CODE&gt;[set_sourcetype_for_applogs]
SOURCE_KEY = Metadata:Source
DEST_KEY = Metadata:Sourcetype
# regex: path/host_ then pull sourcetype from one of the following examples:
# HOST_app1_20100510000003_SOURCETYPE_1.log.1.gz =&amp;gt; SOURCETYPE
# HOST_SOURCE1-TYPE.201005100001.log.1.gz =&amp;gt; SOURCE-TYPE (removal of number)
# HOST_instance1-SOURCE-TYPE.201005100001.log.1.gz =&amp;gt; SOURCE-TYPE (removal of instance and optional number if instance is same as SOURCE)
# HOST_SOURCETYPE.201005102301.log.1.gz =&amp;gt; SOURCETYPE
# It is one big OR statement, so the result can only ever be $2, $3$4, $5, or $6;
# concatenate them all so that nothing is lost no matter which alternative matches
REGEX = .*_(?:(\D+)\d?-(\1.*?)\.\d\d+|(\D+)\d(-.*?)\.\d\d+|.*_\d+_(.*)_|(.*?)\.\d\d+)
FORMAT = sourcetype::$2$3$4$5$6
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 13 May 2010 06:46:11 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Setting-sourcetype-with-a-complex-regex-transforms-conf/m-p/13398#M1234</guid>
      <dc:creator>Jason</dc:creator>
      <dc:date>2010-05-13T06:46:11Z</dc:date>
    </item>
    <item>
      <title>Re: Setting sourcetype with a complex regex - transforms.conf</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Setting-sourcetype-with-a-complex-regex-transforms-conf/m-p/13399#M1235</link>
      <description>&lt;P&gt;According to my teammates, this is not possible; you must use a single capture group only, for example:
FORMAT = $2&lt;/P&gt;

&lt;P&gt;Someone from Splunk, please correct me if multiple capture groups are possible.&lt;/P&gt;</description>
      <pubDate>Thu, 13 May 2010 07:35:46 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Setting-sourcetype-with-a-complex-regex-transforms-conf/m-p/13399#M1235</guid>
      <dc:creator>Jason</dc:creator>
      <dc:date>2010-05-13T07:35:46Z</dc:date>
    </item>
    <item>
      <title>Re: Setting sourcetype with a complex regex - transforms.conf</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Setting-sourcetype-with-a-complex-regex-transforms-conf/m-p/13400#M1236</link>
      <description>&lt;P&gt;This is possible, but only in index-time transforms, which is what you are using. Using multiple capture groups is not possible with search time extractions.&lt;/P&gt;</description>
      <pubDate>Thu, 13 May 2010 14:05:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Setting-sourcetype-with-a-complex-regex-transforms-conf/m-p/13400#M1236</guid>
      <dc:creator>gkanapathy</dc:creator>
      <dc:date>2010-05-13T14:05:47Z</dc:date>
    </item>
    <item>
      <title>Re: Setting sourcetype with a complex regex - transforms.conf</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Setting-sourcetype-with-a-complex-regex-transforms-conf/m-p/13401#M1237</link>
      <description>&lt;P&gt;Did some quick testing and your regex seems good.  Please post the corresponding &lt;CODE&gt;props.conf&lt;/CODE&gt; entries.  Keep in mind that Splunk doesn't do recursive sourcetype matching.  For example, say your events come in with a &lt;CODE&gt;sourcetype::temp&lt;/CODE&gt;, and then you use a transform to reassign the sourcetype to &lt;CODE&gt;sourcetype::my_st&lt;/CODE&gt;.  After re-assigning the sourcetype, Splunk will NOT look up the &lt;CODE&gt;[my_st]&lt;/CODE&gt; stanza for additional sourcetype-specific processing rules.  In other words, an inherent limitation of re-assigning sourcetypes like this is that all events are processed based on their initial sourcetype.&lt;/P&gt;</description>
      <pubDate>Fri, 28 May 2010 01:47:06 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Setting-sourcetype-with-a-complex-regex-transforms-conf/m-p/13401#M1237</guid>
      <dc:creator>Lowell</dc:creator>
      <dc:date>2010-05-28T01:47:06Z</dc:date>
    </item>
    <item>
      <title>Re: Setting sourcetype with a complex regex - transforms.conf</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Setting-sourcetype-with-a-complex-regex-transforms-conf/m-p/13402#M1238</link>
      <description>&lt;P&gt;Don't you want:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[set_sourcetype_for_applogs]
SOURCE_KEY = MetaData:Source
DEST_KEY = MetaData:Sourcetype
....
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;(Note the uppercase "D" in &lt;CODE&gt;MetaData&lt;/CODE&gt;)&lt;/P&gt;

&lt;P&gt;Go look at your own post.  &lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt;&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;&lt;A href="http://answers.splunk.com/questions/2464/forcing-source-or-sourcetype-just-isnt-working" rel="nofollow"&gt;Forcing source or sourcetype just isn't working!&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
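
&lt;P&gt;The case mismatch noted above is the kind of slip a small script can catch. What follows is a minimal sketch, not a Splunk tool: &lt;CODE&gt;find_miscased_keys&lt;/CODE&gt; and &lt;CODE&gt;KNOWN_KEYS&lt;/CODE&gt; are hypothetical names, and the key list is a partial, illustrative subset of what &lt;CODE&gt;transforms.conf&lt;/CODE&gt; accepts. It flags &lt;CODE&gt;SOURCE_KEY&lt;/CODE&gt; and &lt;CODE&gt;DEST_KEY&lt;/CODE&gt; values that differ from a known key only by letter case.&lt;/P&gt;

```python
import re

# Partial, illustrative list of case-sensitive keys (not exhaustive).
KNOWN_KEYS = {
    "_raw",
    "_meta",
    "queue",
    "MetaData:Host",
    "MetaData:Source",
    "MetaData:Sourcetype",
}

SETTING_LINE = re.compile(r"\s*(SOURCE_KEY|DEST_KEY)\s*=\s*(\S+)")


def find_miscased_keys(conf_text):
    """Return (line_number, setting, value, suggestion) tuples for values
    that match a known key when case is ignored but not exactly."""
    by_lower = {key.lower(): key for key in KNOWN_KEYS}
    problems = []
    for number, line in enumerate(conf_text.splitlines(), start=1):
        match = SETTING_LINE.match(line)
        if match is None:
            continue
        setting, value = match.groups()
        expected = by_lower.get(value.lower())
        if expected is not None and value != expected:
            problems.append((number, setting, value, expected))
    return problems


# The stanza as originally posted in this thread.
STANZA = """[set_sourcetype_for_applogs]
SOURCE_KEY = Metadata:Source
DEST_KEY = Metadata:Sourcetype
"""

for problem in find_miscased_keys(STANZA):
    print(problem)
```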

&lt;P&gt;Don't you just hate it when you miss that kind of stuff?  What I wouldn't give for some sort of validating parser...  The funny thing is that I looked to see if you had the case correct, and I missed it too.&lt;/P&gt;</description>
      <pubDate>Fri, 28 May 2010 23:21:11 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Setting-sourcetype-with-a-complex-regex-transforms-conf/m-p/13402#M1238</guid>
      <dc:creator>Lowell</dc:creator>
      <dc:date>2010-05-28T23:21:11Z</dc:date>
    </item>
    <item>
      <title>Re: Setting sourcetype with a complex regex - transforms.conf</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Setting-sourcetype-with-a-complex-regex-transforms-conf/m-p/13403#M1239</link>
      <description>&lt;P&gt;I've been working with the above in a slightly modified form. I'm collecting the logs from the directory /var/log/novell. The log names are things like /var/log/novell/foo.log, /var/log/novell/bar00.log and /var/log/novell/foo.bar.log. What I wanted to grab and use as the sourcetype was the foo, bar and foo.bar portion of the filenames respectively. &lt;BR /&gt;
Here's what I have in transforms.conf&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[set_sourcetype_for_mcommunity_logs]
SOURCE_KEY = MetaData:Source
DEST_KEY = MetaData:Sourcetype
REGEX = .*/novell/(\S+)(\d+)?\.log(\.\d+)?
FORMAT = sourcetype::$1
&lt;/CODE&gt;&lt;/PRE&gt;
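
&lt;P&gt;What this REGEX captures can be checked outside Splunk. The sketch below uses Python's &lt;CODE&gt;re&lt;/CODE&gt; module as a stand-in for Splunk's PCRE engine, which is an assumption that should hold for the constructs used here. The helper &lt;CODE&gt;first_group&lt;/CODE&gt; and the &lt;CODE&gt;LAZY&lt;/CODE&gt; variant are illustrative additions, not part of the configuration above. Because &lt;CODE&gt;\S+&lt;/CODE&gt; is greedy, it also takes trailing digits, so &lt;CODE&gt;bar00.log&lt;/CODE&gt; yields &lt;CODE&gt;bar00&lt;/CODE&gt;; the lazy form &lt;CODE&gt;\S+?&lt;/CODE&gt; leaves those digits to &lt;CODE&gt;(\d+)?&lt;/CODE&gt;.&lt;/P&gt;

```python
import re

# REGEX as posted: the greedy \S+ also consumes trailing digits.
ORIGINAL = re.compile(r".*/novell/(\S+)(\d+)?\.log(\.\d+)?")

# Illustrative variant: the lazy \S+? lets (\d+)? take the trailing digits.
LAZY = re.compile(r".*/novell/(\S+?)(\d+)?\.log(\.\d+)?")


def first_group(pattern, source):
    """Return capture group 1 (the would-be sourcetype), or None."""
    match = pattern.search(source)
    return match.group(1) if match else None


SOURCES = [
    "/var/log/novell/foo.log",
    "/var/log/novell/bar00.log",
    "/var/log/novell/foo.bar.log",
]

for source in SOURCES:
    print(source, first_group(ORIGINAL, source), first_group(LAZY, source))
```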

&lt;P&gt;Here's what I have in props.conf&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[source::.../var/log/novell/*]
TRANSFORMS-set_sourcetype = set_sourcetype_for_mcommunity_logs
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 29 Feb 2012 16:25:48 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Setting-sourcetype-with-a-complex-regex-transforms-conf/m-p/13403#M1239</guid>
      <dc:creator>colinj</dc:creator>
      <dc:date>2012-02-29T16:25:48Z</dc:date>
    </item>
    <item>
      <title>Re: Setting sourcetype with a complex regex - transforms.conf</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Setting-sourcetype-with-a-complex-regex-transforms-conf/m-p/13404#M1240</link>
      <description>&lt;P&gt;@colinj - Is your sourcetyping working with the props.conf and transforms.conf mentioned above? &lt;BR /&gt;
Please confirm.&lt;/P&gt;</description>
      <pubDate>Mon, 03 Feb 2020 06:42:52 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Setting-sourcetype-with-a-complex-regex-transforms-conf/m-p/13404#M1240</guid>
      <dc:creator>navidnaddimulla</dc:creator>
      <dc:date>2020-02-03T06:42:52Z</dc:date>
    </item>
    <item>
      <title>Re: Setting sourcetype with a complex regex - transforms.conf</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Setting-sourcetype-with-a-complex-regex-transforms-conf/m-p/13405#M1241</link>
      <description>&lt;P&gt;OK, so what Lowell said above is exactly what I'm trying to accomplish.  I have logs coming from a Docker container, and I would like to use a regex to tell Splunk that the sourcetype of that log entry is access_combined.  I've set up props and a transform, and I see the sourcetype being changed to access_combined, but it's not parsing the fields.  After looking at the access_combined regex, I don't want to try to figure this out myself.  Is there some way that I can take logs from source::&lt;EM&gt;whatever&lt;/EM&gt; and, based on a regex, somehow get them to be processed by the access_combined sourcetype?&lt;/P&gt;

&lt;P&gt;I'm using the Docker logging driver for Splunk at this time, so I can't set the sourcetype before it hits Splunk, at least not that I'm aware of.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 04:26:02 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Setting-sourcetype-with-a-complex-regex-transforms-conf/m-p/13405#M1241</guid>
      <dc:creator>mpflugfelder</dc:creator>
      <dc:date>2020-09-30T04:26:02Z</dc:date>
    </item>
    <item>
      <title>Re: Setting sourcetype with a complex regex - transforms.conf</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Setting-sourcetype-with-a-complex-regex-transforms-conf/m-p/577819#M201358</link>
      <description>&lt;P&gt;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/131740"&gt;@mpflugfelder&lt;/a&gt;&lt;/P&gt;&lt;P&gt;Did you ever find an answer to this? I'm running into the EXACT same scenario with my OpenShift environment. Seeing that explanation answers a lot about what I'm seeing, as I can't seem to get it to "re-sourcetype" my data. I do see a potential answer in CLONE_SOURCETYPE, but I am afraid that will double up the events, and I'd want to discard the original (and only if the second one contained all the metadata from the first).&lt;/P&gt;</description>
      <pubDate>Wed, 08 Dec 2021 21:59:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Setting-sourcetype-with-a-complex-regex-transforms-conf/m-p/577819#M201358</guid>
      <dc:creator>AHBrook</dc:creator>
      <dc:date>2021-12-08T21:59:00Z</dc:date>
    </item>
  </channel>
</rss>

