<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Transform, but only when matching this RE in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Transform-but-only-when-matching-this-RE/m-p/23733#M4266</link>
    <description>&lt;P&gt;I get lots of data from various systems via syslog. One of my systems sends me data that looks like this&lt;/P&gt;

&lt;P&gt;HEADERTEXT: name=value;name=value;name=value.......&lt;/P&gt;

&lt;P&gt;I have a generic transform written to extract the name/value pairs. The problem is that I have other data that looks like this:&lt;/P&gt;

&lt;P&gt;SOMEOTHERHEADER: &lt;A href="http://www.blah.com/servlet?name=value;name=value" rel="nofollow"&gt;http://www.blah.com/servlet?name=value;name=value&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;What I am finding is that the name/value extraction from my first transform is getting applied to data from the second as well. What I would like to do is somehow say in props.conf:&lt;/P&gt;

&lt;P&gt;"Only apply this stanza if this RE is matched". I would then put the RE as "HEADERTEXT".&lt;/P&gt;

&lt;P&gt;Does anyone have any pointers on whether something like this is possible? I can't put HEADERTEXT in the RE in transforms.conf, as that RE is applied repeatedly to extract multiple key/value pairs.&lt;/P&gt;

&lt;P&gt;Here are some samples, plus my matching REs from transforms.conf. As you can see, the User-Agent in the first example (DATA1) actually causes the data to match both REGEX1 and REGEX2, causing the data to be tagged with both sourcetypes.&lt;/P&gt;

&lt;P&gt;DATA1&lt;/P&gt;

&lt;P&gt;Aug 2 21:54:32 10.1.2.3 tmm[1853]: Rule syslog_http : HTTP,10.1.2.4:5804,vs_https_oursite,4.4.4.3:49788,oururl.com,/somepath,10.1.2.5:7001,302,2,http://somewhere.gov/,GET,'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; aff-kingsoft-ciba; staticlogin:product=cboxf09&amp;amp;act=login&amp;amp;info=ZmlsZW5hbWU9UG93ZXJ3b; SE 2.X)',''&lt;/P&gt;

&lt;P&gt;REGEX1&lt;/P&gt;

&lt;P&gt;tmm[\d+]: Rule syslog_http &amp;lt;(?:HTTP_(?:RESPONSE|REQUEST)|LB_FAILED)&amp;gt;: (?:HTTP|HTTP-ERROR|LB-ERROR),([\d|.]+):([\d]+),([\w]+),([\d|.]+):([\d]+),([\w\d:.-]+),([^,?]+)(\?[^,]),(?:([\d|.]+):([\d]+))?,([\d]+),([\d]),([^,]),([^,]),'([^,])','([^,]*)'&lt;/P&gt;

&lt;P&gt;DATA2&lt;/P&gt;

&lt;P&gt;Aug 2 01:30:01 10.120.17.247 user:01:30:02.019 INFO SummaryData - SUMMARY:name1=value1;name2=value2;name3=value3;&lt;/P&gt;

&lt;P&gt;REGEX2&lt;/P&gt;

&lt;P&gt;([_a-z]+)=([^;]+);&lt;/P&gt;</description>
    <pubDate>Sat, 31 Jul 2010 19:27:33 GMT</pubDate>
    <dc:creator>serialmonkey</dc:creator>
    <dc:date>2010-07-31T19:27:33Z</dc:date>
    <item>
      <title>Transform, but only when matching this RE</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transform-but-only-when-matching-this-RE/m-p/23733#M4266</link>
      <description>&lt;P&gt;I get lots of data from various systems via syslog. One of my systems sends me data that looks like this&lt;/P&gt;

&lt;P&gt;HEADERTEXT: name=value;name=value;name=value.......&lt;/P&gt;

&lt;P&gt;I have a generic transform written to extract the name/value pairs. The problem is that I have other data that looks like this:&lt;/P&gt;

&lt;P&gt;SOMEOTHERHEADER: &lt;A href="http://www.blah.com/servlet?name=value;name=value" rel="nofollow"&gt;http://www.blah.com/servlet?name=value;name=value&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;What I am finding is that the name/value extraction from my first transform is getting applied to data from the second as well. What I would like to do is somehow say in props.conf:&lt;/P&gt;

&lt;P&gt;"Only apply this stanza if this RE is matched". I would then put the RE as "HEADERTEXT".&lt;/P&gt;

&lt;P&gt;Does anyone have any pointers on whether something like this is possible? I can't put HEADERTEXT in the RE in transforms.conf, as that RE is applied repeatedly to extract multiple key/value pairs.&lt;/P&gt;

&lt;P&gt;Here are some samples, plus my matching REs from transforms.conf. As you can see, the User-Agent in the first example (DATA1) actually causes the data to match both REGEX1 and REGEX2, causing the data to be tagged with both sourcetypes.&lt;/P&gt;

&lt;P&gt;DATA1&lt;/P&gt;

&lt;P&gt;Aug 2 21:54:32 10.1.2.3 tmm[1853]: Rule syslog_http : HTTP,10.1.2.4:5804,vs_https_oursite,4.4.4.3:49788,oururl.com,/somepath,10.1.2.5:7001,302,2,http://somewhere.gov/,GET,'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; aff-kingsoft-ciba; staticlogin:product=cboxf09&amp;amp;act=login&amp;amp;info=ZmlsZW5hbWU9UG93ZXJ3b; SE 2.X)',''&lt;/P&gt;

&lt;P&gt;REGEX1&lt;/P&gt;

&lt;P&gt;tmm[\d+]: Rule syslog_http &amp;lt;(?:HTTP_(?:RESPONSE|REQUEST)|LB_FAILED)&amp;gt;: (?:HTTP|HTTP-ERROR|LB-ERROR),([\d|.]+):([\d]+),([\w]+),([\d|.]+):([\d]+),([\w\d:.-]+),([^,?]+)(\?[^,]),(?:([\d|.]+):([\d]+))?,([\d]+),([\d]),([^,]),([^,]),'([^,])','([^,]*)'&lt;/P&gt;

&lt;P&gt;DATA2&lt;/P&gt;

&lt;P&gt;Aug 2 01:30:01 10.120.17.247 user:01:30:02.019 INFO SummaryData - SUMMARY:name1=value1;name2=value2;name3=value3;&lt;/P&gt;

&lt;P&gt;REGEX2&lt;/P&gt;

&lt;P&gt;([_a-z]+)=([^;]+);&lt;/P&gt;</description>
      <pubDate>Sat, 31 Jul 2010 19:27:33 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transform-but-only-when-matching-this-RE/m-p/23733#M4266</guid>
      <dc:creator>serialmonkey</dc:creator>
      <dc:date>2010-07-31T19:27:33Z</dc:date>
    </item>
    <item>
      <title>Re: Transform, but only when matching this RE</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transform-but-only-when-matching-this-RE/m-p/23734#M4267</link>
      <description>&lt;P&gt;Your problem seems to be that both of your field extractions are applied using the same spec, which I imagine is the syslog sourcetype.&lt;/P&gt;

&lt;P&gt;Take a look at your props.conf to find out what spec is being used for that extraction.&lt;/P&gt;

&lt;P&gt;&lt;A href="http://www.splunk.com/base/Documentation/latest/Knowledge/Createandmaintainsearch-timefieldextractionsthroughconfigurationfiles#Add_a_regex_stanza_to_props.conf" rel="nofollow"&gt;http://www.splunk.com/base/Documentation/latest/Knowledge/Createandmaintainsearch-timefieldextractionsthroughconfigurationfiles#Add_a_regex_stanza_to_props.conf&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;[&lt;spec&gt;]
EXTRACT-&lt;class&gt; = &lt;regex&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;* &amp;lt;spec&amp;gt; can be:
      o &amp;lt;sourcetype&amp;gt;, the source type of an event.
      o host::&amp;lt;host&amp;gt;, where &amp;lt;host&amp;gt; is the host for an event.
      o source::&amp;lt;source&amp;gt;, where &amp;lt;source&amp;gt; is the source for an event. 
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;If it is indeed based on the [sourcetype] spec, then you have two options:&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;&lt;P&gt;If each set of events (HEADERTEXT, SOMEOTHERHEADER) is coming from a different host, then you could simply define one field extraction for each, with [host::&lt;host&gt;] as the discriminating spec.&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;P&gt;If you are unable to differentiate by host, then you may have to stick with [sourcetype] as the discriminating spec, in which case I suggest that you set up a regex-based sourcetype override in order to assign a custom sourcetype to each type of event.&lt;/P&gt;&lt;/LI&gt;
&lt;/UL&gt;
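
&lt;P&gt;As a rough sketch of the second option, assuming the events can be told apart by the HEADERTEXT prefix from the question: the stanza names (set_st_headertext, headertext_kv) and the sourcetype name (headertext) below are made up for illustration, and the key/value regex is the one posted in the question. Adjust all of them to your own data.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# props.conf
[syslog]
# Index time: give events that carry the header a sourcetype of their own.
TRANSFORMS-set_st_headertext = set_st_headertext

[headertext]
# Search time: the generic name=value extraction is only applied to
# events that were assigned the new sourcetype.
REPORT-headertext_kv = headertext_kv

# transforms.conf
[set_st_headertext]
# HEADERTEXT is the placeholder from the question, not a real header.
REGEX = HEADERTEXT:
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::headertext

[headertext_kv]
# Key/value regex copied from the question (REGEX2).
REGEX = ([_a-z]+)=([^;]+);
FORMAT = $1::$2
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;With something like this in place, events without the header stay in the syslog sourcetype and never reach the name=value extraction. Since the override runs at index time, it only affects events indexed after the change.&lt;/P&gt;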

&lt;P&gt;&lt;A href="http://www.splunk.com/base/Documentation/latest/Admin/Advancedsourcetypeoverrides#Configuration" rel="nofollow"&gt;http://www.splunk.com/base/Documentation/latest/Admin/Advancedsourcetypeoverrides#Configuration&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;Then, you can re-write your extractions to be specific to the sourcetypes you just defined. Make sure that the sourcetype assignment happens before the field extraction in transforms.conf.&lt;/P&gt;</description>
      <pubDate>Sat, 31 Jul 2010 23:57:26 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transform-but-only-when-matching-this-RE/m-p/23734#M4267</guid>
      <dc:creator>hexx</dc:creator>
      <dc:date>2010-07-31T23:57:26Z</dc:date>
    </item>
    <item>
      <title>Re: Transform, but only when matching this RE</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transform-but-only-when-matching-this-RE/m-p/23735#M4268</link>
      <description>&lt;P&gt;Thanks for the answer.&lt;/P&gt;

&lt;P&gt;Unfortunately, I can't split the data based on host.&lt;/P&gt;

&lt;P&gt;What I have currently is two entries in my props.conf, both under a [syslog] stanza representing the syslog input type (as you guessed). Each of them links to a separate TRANSFORMS entry.&lt;/P&gt;

&lt;P&gt;The problem I have is that in some cases I have data which matches both transforms. What I see in this case when I search for this data is that the sourcetype attribute actually appears three times on that search result (once for sourcetype=syslog, and additionally for the other two transforms).&lt;/P&gt;

&lt;P&gt;One of my regexes is a name=value style regex. The other is more concrete. I guess what I can do is add a component to my concrete regex that prevents it from matching the name=value style data. It's messy, but it should work.&lt;/P&gt;</description>
      <pubDate>Sun, 01 Aug 2010 19:15:28 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transform-but-only-when-matching-this-RE/m-p/23735#M4268</guid>
      <dc:creator>serialmonkey</dc:creator>
      <dc:date>2010-08-01T19:15:28Z</dc:date>
    </item>
    <item>
      <title>Re: Transform, but only when matching this RE</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transform-but-only-when-matching-this-RE/m-p/23736#M4269</link>
      <description>&lt;P&gt;Show some sample data and the two regexes from props.conf/transforms.conf.&lt;/P&gt;</description>
      <pubDate>Sun, 01 Aug 2010 23:29:58 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transform-but-only-when-matching-this-RE/m-p/23736#M4269</guid>
      <dc:creator>Genti</dc:creator>
      <dc:date>2010-08-01T23:29:58Z</dc:date>
    </item>
    <item>
      <title>Re: Transform, but only when matching this RE</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transform-but-only-when-matching-this-RE/m-p/23737#M4270</link>
      <description>&lt;P&gt;Samples added above.&lt;/P&gt;</description>
      <pubDate>Tue, 03 Aug 2010 19:49:48 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transform-but-only-when-matching-this-RE/m-p/23737#M4270</guid>
      <dc:creator>serialmonkey</dc:creator>
      <dc:date>2010-08-03T19:49:48Z</dc:date>
    </item>
    <item>
      <title>Re: Transform, but only when matching this RE</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transform-but-only-when-matching-this-RE/m-p/23738#M4271</link>
      <description>&lt;P&gt;In the end, I just had to work more complexity into my REs. Unfortunately, the funky name=value style REs can match more than you intend, especially if you push lots of data in via the same input (i.e. syslog).&lt;/P&gt;</description>
      <pubDate>Sun, 08 Aug 2010 06:57:35 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transform-but-only-when-matching-this-RE/m-p/23738#M4271</guid>
      <dc:creator>serialmonkey</dc:creator>
      <dc:date>2010-08-08T06:57:35Z</dc:date>
    </item>
  </channel>
</rss>

