<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Performantly overriding sourcetype per event with new replacement string, not backreference? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Performantly-overriding-sourcetype-per-event-with-new/m-p/481408#M134918</link>
    <description>&lt;P&gt;not sure if your comment is an answer ... &lt;BR /&gt;
can you elaborate on the problem you are trying to solve? what is it that you would like to achieve?&lt;/P&gt;</description>
    <pubDate>Sat, 09 Nov 2019 02:15:16 GMT</pubDate>
    <dc:creator>adonio</dc:creator>
    <dc:date>2019-11-09T02:15:16Z</dc:date>
    <item>
      <title>Performantly overriding sourcetype per event with new replacement string, not backreference?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Performantly-overriding-sourcetype-per-event-with-new/m-p/481406#M134916</link>
      <description>&lt;P&gt;I know how to use Splunk 7.3.0 to override source type per event using a backreference. For example, given this snippet of incoming JSON Lines:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;"code":"red"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I can do this in &lt;CODE&gt;transforms.conf&lt;/CODE&gt;:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;REGEX = \"code\":\"([^\"]+)\"
FORMAT = sourcetype::$1
DEST_KEY = MetaData:Sourcetype
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Code "red" in the incoming JSON Lines event data sets the event source type to "red".&lt;/P&gt;

&lt;P&gt;But suppose I don't want to use the value of &lt;CODE&gt;code&lt;/CODE&gt; as the &lt;CODE&gt;sourcetype&lt;/CODE&gt;? Suppose I want to map each &lt;CODE&gt;code&lt;/CODE&gt; value to a completely different &lt;CODE&gt;sourcetype&lt;/CODE&gt; value? Perhaps each incoming &lt;CODE&gt;code&lt;/CODE&gt; value uniquely identifies a different source type, but the actual &lt;CODE&gt;code&lt;/CODE&gt; value is not Splunk-y enough to be a &lt;CODE&gt;sourcetype&lt;/CODE&gt; value? Although, I &lt;EM&gt;don't&lt;/EM&gt; want to get into &lt;CODE&gt;sourcetype&lt;/CODE&gt; naming conventions here.&lt;/P&gt;

&lt;P&gt;The only way I have thought of doing this so far is to create a stanza for each &lt;CODE&gt;code&lt;/CODE&gt; value. For example, in &lt;CODE&gt;transforms.conf&lt;/CODE&gt; (these &lt;CODE&gt;code&lt;/CODE&gt; and &lt;CODE&gt;sourcetype&lt;/CODE&gt; values are fictitious):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[set_sourcetype_test_red]
REGEX = \"code\":\"red\"
FORMAT = sourcetype::scarlet
DEST_KEY = MetaData:Sourcetype
[set_sourcetype_test_green]
REGEX = \"code\":\"green\"
FORMAT = sourcetype::emerald
DEST_KEY = MetaData:Sourcetype
[set_sourcetype_test_blue]
REGEX = \"code\":\"blue\"
FORMAT = sourcetype::aqua
DEST_KEY = MetaData:Sourcetype
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;and in &lt;CODE&gt;props.conf&lt;/CODE&gt;:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;TRANSFORMS-changesourcetype = set_sourcetype_test_red, set_sourcetype_test_green, set_sourcetype_test_blue
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Codes "red", "green", and "blue" become source types "scarlet", "emerald", and "aqua".&lt;/P&gt;

&lt;P&gt;I don't like this multi-stanza technique. I currently have only half a dozen or so source types in this context, but I might end up with many more.&lt;/P&gt;

&lt;P&gt;Can anyone suggest a more concise, more performant technique; say, a single stanza with a single regex? I can't see how to do it.&lt;/P&gt;

&lt;P&gt;For the purposes of this question:&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;The different &lt;CODE&gt;code&lt;/CODE&gt; values are all arriving at the same Splunk input (for example, TCP port)&lt;/LI&gt;
&lt;LI&gt;I know what all the &lt;CODE&gt;code&lt;/CODE&gt; values are (although, a fallback transform that uses a backreference for unexpected &lt;CODE&gt;code&lt;/CODE&gt; values would be useful)&lt;/LI&gt;
&lt;/UL&gt;
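
&lt;P&gt;A minimal sketch of such a fallback, assuming that the transforms listed in one &lt;CODE&gt;TRANSFORMS-&lt;/CODE&gt; class run in the listed order and that a later matching transform overwrites the key set by an earlier one. The stanza name &lt;CODE&gt;set_sourcetype_test_fallback&lt;/CODE&gt; is hypothetical. Listing the backreference transform first would let the specific stanzas override it for known &lt;CODE&gt;code&lt;/CODE&gt; values:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# transforms.conf (hypothetical fallback stanza)
[set_sourcetype_test_fallback]
REGEX = \"code\":\"([^\"]+)\"
FORMAT = sourcetype::$1
DEST_KEY = MetaData:Sourcetype

# props.conf: fallback first, specific mappings after it
TRANSFORMS-changesourcetype = set_sourcetype_test_fallback, set_sourcetype_test_red, set_sourcetype_test_green, set_sourcetype_test_blue
&lt;/CODE&gt;&lt;/PRE&gt;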

&lt;P&gt;I notice that the Splunk docs contain the &lt;A href="https://docs.splunk.com/Documentation/Splunk/7.3.2/ReleaseNotes/pcre2"&gt;PCRE2&lt;/A&gt; license, but the &lt;CODE&gt;transforms.conf&lt;/CODE&gt; docs don't appear to mention any PCRE2-specific functionality, and anyway, I'm not even sure whether PCRE2-level substitution features would be of help here.&lt;/P&gt;</description>
      <pubDate>Sat, 09 Nov 2019 00:58:31 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Performantly-overriding-sourcetype-per-event-with-new/m-p/481406#M134916</guid>
      <dc:creator>Graham_Hanningt</dc:creator>
      <dc:date>2019-11-09T00:58:31Z</dc:date>
    </item>
    <item>
      <title>Re: Performantly overriding sourcetype per event with new replacement string, not backreference?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Performantly-overriding-sourcetype-per-event-with-new/m-p/481407#M134917</link>
      <description>&lt;P&gt;Perhaps I'm trying too hard to be Splunk-y by attempting to map each of these incoming &lt;CODE&gt;code&lt;/CODE&gt; values to a different &lt;CODE&gt;sourcetype&lt;/CODE&gt; value. I &lt;EM&gt;could&lt;/EM&gt; simply forget about overriding the source type per event, set a fixed &lt;CODE&gt;sourcetype&lt;/CODE&gt;, and, in my searches, where I currently refer only to &lt;CODE&gt;sourcetype&lt;/CODE&gt;, refer instead to both &lt;CODE&gt;sourcetype&lt;/CODE&gt; and &lt;CODE&gt;code&lt;/CODE&gt;. (I didn't mention this in the question, but I typically use a transform to remove &lt;CODE&gt;code&lt;/CODE&gt; after using it to override &lt;CODE&gt;sourcetype&lt;/CODE&gt;.) I typically place such search snippets in macros, anyway, to isolate my dashboard Simple XML from such issues.&lt;/P&gt;
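
&lt;P&gt;A rough sketch of the kind of removal transform mentioned above, assuming it is listed after the sourcetype override in the same &lt;CODE&gt;TRANSFORMS-&lt;/CODE&gt; class. The stanza name &lt;CODE&gt;remove_code&lt;/CODE&gt; is hypothetical, and the regex assumes that the &lt;CODE&gt;code&lt;/CODE&gt; property is followed by a comma (that is, it is not the last property in the JSON object):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[remove_code]
# Capture the text before and after the code property, then rebuild _raw without it
REGEX = ^(.*?)\"code\":\"[^\"]+\",(.*)$
FORMAT = $1$2
DEST_KEY = _raw
&lt;/CODE&gt;&lt;/PRE&gt;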

&lt;P&gt;Not overriding the source type would mean that, if the data is ingested by uploading from a file on my computer, the search that Splunk Web offers for the newly uploaded data will actually find results!&lt;/P&gt;</description>
      <pubDate>Sat, 09 Nov 2019 01:36:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Performantly-overriding-sourcetype-per-event-with-new/m-p/481407#M134917</guid>
      <dc:creator>Graham_Hanningt</dc:creator>
      <dc:date>2019-11-09T01:36:55Z</dc:date>
    </item>
    <item>
      <title>Re: Performantly overriding sourcetype per event with new replacement string, not backreference?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Performantly-overriding-sourcetype-per-event-with-new/m-p/481408#M134918</link>
      <description>&lt;P&gt;not sure if your comment is an answer ... &lt;BR /&gt;
can you elaborate on the problem you are trying to solve? what is it that you would like to achieve?&lt;/P&gt;</description>
      <pubDate>Sat, 09 Nov 2019 02:15:16 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Performantly-overriding-sourcetype-per-event-with-new/m-p/481408#M134918</guid>
      <dc:creator>adonio</dc:creator>
      <dc:date>2019-11-09T02:15:16Z</dc:date>
    </item>
    <item>
      <title>Re: Performantly overriding sourcetype per event with new replacement string, not backreference?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Performantly-overriding-sourcetype-per-event-with-new/m-p/481409#M134919</link>
      <description>&lt;P&gt;You could use &lt;CODE&gt;INGEST_EVAL&lt;/CODE&gt; with a &lt;CODE&gt;case&lt;/CODE&gt; statement to facilitate this.&lt;/P&gt;</description>
      <pubDate>Sat, 09 Nov 2019 23:29:20 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Performantly-overriding-sourcetype-per-event-with-new/m-p/481409#M134919</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2019-11-09T23:29:20Z</dc:date>
    </item>
    <item>
      <title>Re: Performantly overriding sourcetype per event with new replacement string, not backreference?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Performantly-overriding-sourcetype-per-event-with-new/m-p/481410#M134920</link>
      <description>&lt;P&gt;I've just submitted the following feedback on the Splunk 7.3.0 docs page for &lt;CODE&gt;transforms.conf&lt;/CODE&gt;:&lt;/P&gt;

&lt;HR /&gt;

&lt;P&gt;I've seen that Splunk docs cite the PCRE2 license, so I'd hoped that regex replacement in transforms.conf would support PCRE2 replacements. Apparently not :-(, hence this feedback.&lt;/P&gt;

&lt;P&gt;The following settings:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[set_sourcetype_test_pcre2]
REGEX = \"code\":\"(?&amp;lt;red&amp;gt;red)|(?&amp;lt;green&amp;gt;green)|(?&amp;lt;blue&amp;gt;blue)|(?&amp;lt;other&amp;gt;[^\"]+)\"
FORMAT = sourcetype::${red:+scarlet:}${green:+emerald:}${blue:+aqua:}
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;with input JSON Lines snippet such as:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;"code":"red"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;results in a sourcetype value of, literally:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;${red:+scarlet:}${green:+emerald:}${blue:+aqua:}
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;That is, regex processing in Splunk appears not to recognize the PCRE2 replacement syntax.&lt;/P&gt;

&lt;P&gt;Or perhaps I'm doing something wrong.&lt;/P&gt;

&lt;P&gt;Here's what I want to happen: if the code property value is "red", then set sourcetype to "scarlet"; if code "green", set sourcetype "emerald"; if code "blue", set sourcetype "aqua".&lt;/P&gt;

&lt;P&gt;For more details, see my related question in Splunk Answers, "&lt;A href="https://answers.splunk.com/answers/782356"&gt;Performantly overriding sourcetype per event with new replacement string, not backreference?&lt;/A&gt;".&lt;/P&gt;

&lt;HR /&gt;

&lt;P&gt;By "doing something wrong", I mean, for example: if the named capture group "red" is unset, then I want the replacement value to be an empty string, hence the lack of a string after the second colon; however, I'm unsure whether PCRE2 allows this; whether I need to specify "something" as the replacement string.&lt;/P&gt;</description>
      <pubDate>Fri, 15 Nov 2019 05:45:26 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Performantly-overriding-sourcetype-per-event-with-new/m-p/481410#M134920</guid>
      <dc:creator>Graham_Hanningt</dc:creator>
      <dc:date>2019-11-15T05:45:26Z</dc:date>
    </item>
    <item>
      <title>Re: Performantly overriding sourcetype per event with new replacement string, not backreference?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Performantly-overriding-sourcetype-per-event-with-new/m-p/481411#M134921</link>
      <description>&lt;P&gt;Hi adonio,&lt;/P&gt;

&lt;P&gt;My &lt;EM&gt;question&lt;/EM&gt; includes an answer, but, as I wrote, I don't like the technique it uses. My first comment after the question describes a &lt;EM&gt;workaround&lt;/EM&gt;, rather than an answer: abandoning the idea of a granular &lt;CODE&gt;sourcetype&lt;/CODE&gt; field, and instead relying on a fixed, generic &lt;CODE&gt;sourcetype&lt;/CODE&gt; field in combination with a separate &lt;CODE&gt;code&lt;/CODE&gt; field.&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
&lt;P&gt;can you elaborate on the problem you are trying to solve? &lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;I want to use a value in incoming JSON Lines data to set &lt;CODE&gt;sourcetype&lt;/CODE&gt; per event. The value in the incoming data and the &lt;CODE&gt;sourcetype&lt;/CODE&gt; are completely different.&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
&lt;P&gt;what is it that you would like to achieve?&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;A more performant solution than the one I have now. Suppose I have 20 source types. Using my current technique, that means 20 separate stanzas in &lt;CODE&gt;transforms.conf&lt;/CODE&gt;. I'm hoping for something more elegant and concise; and I'm hoping that this also means "more performant" (faster; less index-time processing for the transform).&lt;/P&gt;

&lt;P&gt;I was hoping that PCRE2 replacement syntax might work; see my recent related comment on this question.&lt;/P&gt;</description>
      <pubDate>Fri, 15 Nov 2019 05:53:21 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Performantly-overriding-sourcetype-per-event-with-new/m-p/481411#M134921</guid>
      <dc:creator>Graham_Hanningt</dc:creator>
      <dc:date>2019-11-15T05:53:21Z</dc:date>
    </item>
    <item>
      <title>Re: Performantly overriding sourcetype per event with new replacement string, not backreference?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Performantly-overriding-sourcetype-per-event-with-new/m-p/481412#M134922</link>
      <description>&lt;P&gt;Yes!&lt;/P&gt;

&lt;P&gt;This works:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[set_sourcetype]
INGEST_EVAL = sourcetype:=case(match(_raw, "\"code\":\"red\""), "scarlet", match(_raw, "\"code\":\"green\""), "emerald", match(_raw, "\"code\":\"blue\""), "aqua", true(), "other")
&lt;/CODE&gt;&lt;/PRE&gt;
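
&lt;P&gt;For context, a sketch of the &lt;CODE&gt;props.conf&lt;/CODE&gt; entry that would invoke this transform, assuming the events arrive with a hypothetical initial source type named &lt;CODE&gt;my_json_lines&lt;/CODE&gt;:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# props.conf; the stanza name is an assumed incoming sourcetype
[my_json_lines]
TRANSFORMS-changesourcetype = set_sourcetype
&lt;/CODE&gt;&lt;/PRE&gt;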

&lt;P&gt;Thank you for your answer. My apologies for this belated comment.&lt;/P&gt;

&lt;P&gt;I don't like the repetition of &lt;CODE&gt;match(_raw, ... )&lt;/CODE&gt; in my &lt;CODE&gt;case&lt;/CODE&gt; function, though.&lt;/P&gt;

&lt;P&gt;Here's a variation that extracts the &lt;CODE&gt;code&lt;/CODE&gt; value into &lt;CODE&gt;sourcetype&lt;/CODE&gt; in one transform, and then refers to that "temporary" &lt;CODE&gt;sourcetype&lt;/CODE&gt; in the &lt;CODE&gt;INGEST_EVAL&lt;/CODE&gt; in a second transform:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[get_sourcetype_from_code]
REGEX = \"code\":\"([^\"]+)\"
FORMAT = sourcetype::$1
DEST_KEY = MetaData:Sourcetype
[set_sourcetype]
INGEST_EVAL = sourcetype:=case(sourcetype=="red", "scarlet", sourcetype=="green", "emerald", sourcetype=="blue", "aqua", true(), sourcetype)
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;(Requires &lt;CODE&gt;props.conf&lt;/CODE&gt; to refer to the two transforms in sequence. For example: &lt;CODE&gt;TRANSFORMS-changesourcetype = get_sourcetype_from_code,set_sourcetype&lt;/CODE&gt;.)&lt;/P&gt;</description>
      <pubDate>Fri, 15 Nov 2019 07:51:27 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Performantly-overriding-sourcetype-per-event-with-new/m-p/481412#M134922</guid>
      <dc:creator>Graham_Hanningt</dc:creator>
      <dc:date>2019-11-15T07:51:27Z</dc:date>
    </item>
    <item>
      <title>Re: Performantly overriding sourcetype per event with new replacement string, not backreference?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Performantly-overriding-sourcetype-per-event-with-new/m-p/481413#M134923</link>
      <description>&lt;P&gt;Incidental observation: the example &lt;CODE&gt;set_sourcetype&lt;/CODE&gt; stanza in my previous comment (deliberately) doesn't specify a &lt;CODE&gt;REGEX&lt;/CODE&gt; setting. &lt;CODE&gt;splunkd&lt;/CODE&gt; reports this omission as an error:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;ERROR regexExtractionProcessor - REGEX field must be specified tranform_name=set_sourcetype
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;My opinion: this error is a bug. In practice, a &lt;CODE&gt;REGEX&lt;/CODE&gt; is not required for this stanza.&lt;/P&gt;

&lt;P&gt;Nit: Splunk, please correct the typo &lt;CODE&gt;tranform_name&lt;/CODE&gt; (sic; note the missing "s") in the error text.&lt;/P&gt;</description>
      <pubDate>Fri, 15 Nov 2019 07:58:42 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Performantly-overriding-sourcetype-per-event-with-new/m-p/481413#M134923</guid>
      <dc:creator>Graham_Hanningt</dc:creator>
      <dc:date>2019-11-15T07:58:42Z</dc:date>
    </item>
    <item>
      <title>Re: Performantly overriding sourcetype per event with new replacement string, not backreference?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Performantly-overriding-sourcetype-per-event-with-new/m-p/481414#M134924</link>
      <description>&lt;P&gt;VERY nicely done!  I like it.&lt;/P&gt;</description>
      <pubDate>Fri, 15 Nov 2019 15:48:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Performantly-overriding-sourcetype-per-event-with-new/m-p/481414#M134924</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2019-11-15T15:48:55Z</dc:date>
    </item>
    <item>
      <title>Re: Performantly overriding sourcetype per event with new replacement string, not backreference?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Performantly-overriding-sourcetype-per-event-with-new/m-p/481415#M134925</link>
      <description>&lt;P&gt;A Splunk docs contact has responded to my feedback (thank you!), and confirmed that, as of Splunk 8.0.0, Splunk doesn't support functions specific to PCRE2, such as these substitution functions.&lt;/P&gt;</description>
      <pubDate>Thu, 21 Nov 2019 03:12:56 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Performantly-overriding-sourcetype-per-event-with-new/m-p/481415#M134925</guid>
      <dc:creator>Graham_Hanningt</dc:creator>
      <dc:date>2019-11-21T03:12:56Z</dc:date>
    </item>
  </channel>
</rss>

