<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Can I transform data and extract fields at once? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Can-I-transform-data-and-extract-fields-at-once/m-p/175553#M50391</link>
    <description>&lt;P&gt;I don't really care where this extraction is happening. I'm fine with anywhere, as long as it's fast and easy to do. I just want to use the fields. I'm only using the _raw data because that's what the &lt;A href="http://wiki.splunk.com/Community:StripSyslog"&gt;Splunk docs suggest&lt;/A&gt;, and I'm using the default Transform named &lt;CODE&gt;syslog-header-stripper-ts-host-proc&lt;/CODE&gt; from &lt;CODE&gt;default/transforms.conf&lt;/CODE&gt;.&lt;/P&gt;</description>
    <pubDate>Thu, 08 Jan 2015 01:32:32 GMT</pubDate>
    <dc:creator>stefanlasiewski</dc:creator>
    <dc:date>2015-01-08T01:32:32Z</dc:date>
    <item>
      <title>Can I transform data and extract fields at once?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Can-I-transform-data-and-extract-fields-at-once/m-p/175550#M50388</link>
      <description>&lt;P&gt;Our Splunk server receives data via syslog. As a result, I need to transform the syslog data using transforms.conf and props.conf (details in the question "&lt;A href="http://answers.splunk.com/answers/206210/why-does-splunk-not-recognize-standard-fields-in-m-1.html"&gt;Why does Splunk not recognize standard fields in my Apache data forwarded by syslog?&lt;/A&gt;").&lt;/P&gt;

&lt;P&gt;My question: can I transform the data and still do some field extraction on that data? I would like to preserve the &lt;CODE&gt;process&lt;/CODE&gt; field. However, the default transform simply strips the header out of the data. It doesn't save any of the fields.&lt;/P&gt;

&lt;P&gt;So, given the following transformation in &lt;CODE&gt;local/props.conf&lt;/CODE&gt;:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[syslog]
TRANSFORMS-strip-syslog-header = syslog-header-stripper-ts-host-proc
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;And this default transform from &lt;CODE&gt;default/transforms.conf&lt;/CODE&gt;:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# This will strip out date stamp, host, process with pid and just get the 
# actual message
[syslog-header-stripper-ts-host-proc]
REGEX         = ^[A-Z][a-z]+\s+\d+\s\d+:\d+:\d+\s.*?:\s(.*)$
FORMAT        = $1
DEST_KEY      = _raw
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Can I somehow preserve one of the fields and save it to the name of &lt;CODE&gt;process&lt;/CODE&gt;?&lt;/P&gt;

&lt;P&gt;I have had some luck with the following pattern, saved at &lt;A href="https://www.regex101.com/r/iK8iX5/1"&gt;https://www.regex101.com/r/iK8iX5/1&lt;/A&gt; . However, I am uncertain how to use this in a Splunk Transform.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;^(?&amp;lt;SyslogPri&amp;gt;&amp;lt;\d+&amp;gt;)(?&amp;lt;SyslogDate&amp;gt;[A-Z][a-z]+\s+\d+\s\d+:\d+:\d+)\s(?&amp;lt;SyslogHost&amp;gt;.*)\s(?&amp;lt;process&amp;gt;.*):\s(?&amp;lt;SyslogMessage&amp;gt;.*)$
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 07 Jan 2015 19:42:54 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Can-I-transform-data-and-extract-fields-at-once/m-p/175550#M50388</guid>
      <dc:creator>stefanlasiewski</dc:creator>
      <dc:date>2015-01-07T19:42:54Z</dc:date>
    </item>
    <item>
      <title>Re: Can I transform data and extract fields at once?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Can-I-transform-data-and-extract-fields-at-once/m-p/175551#M50389</link>
      <description>&lt;P&gt;You can't do this at index time, but you can at search time, because FORMAT accepts multiple name-value pairs.&lt;/P&gt;

&lt;P&gt;Refer to: &lt;A href="http://docs.splunk.com/Documentation/Splunk/6.2.1/Admin/Transformsconf"&gt;http://docs.splunk.com/Documentation/Splunk/6.2.1/Admin/Transformsconf&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;FORMAT = &amp;lt;string&amp;gt;&lt;BR /&gt;
* NOTE: This option is valid for both index-time and search-time field extraction. However, FORMAT &lt;BR /&gt;
  behaves differently depending on whether the extraction is performed at index time or &lt;BR /&gt;
  search time.&lt;BR /&gt;
* This attribute specifies the format of the event, including any field names or values you want &lt;BR /&gt;
  to add.&lt;BR /&gt;
* FORMAT for index-time extractions:&lt;BR /&gt;
    * Use $n (for example $1, $2, etc) to specify the output of each REGEX match. &lt;BR /&gt;
    * If REGEX does not have n groups, the matching fails. &lt;BR /&gt;
    * The special identifier $0 represents what was in the DEST_KEY before the REGEX was performed.&lt;BR /&gt;
    * At index time only, you can use FORMAT to create concatenated fields:&lt;BR /&gt;
        * FORMAT = ipaddress::$1.$2.$3.$4&lt;BR /&gt;
    * When you create concatenated fields with FORMAT, "$" is the only special character. It is &lt;BR /&gt;
      treated as a prefix for regex-capturing groups only if it is followed by a number and only &lt;BR /&gt;
      if the number applies to an existing capturing group. So if REGEX has only one capturing &lt;BR /&gt;
      group and its value is "bar", then:&lt;BR /&gt;
        * "FORMAT = foo$1" yields "foobar"&lt;BR /&gt;
        * "FORMAT = foo$bar" yields "foo$bar"&lt;BR /&gt;
        * "FORMAT = foo$1234" yields "foo$1234"&lt;BR /&gt;
        * "FORMAT = foo$1\$2" yields "foobar\$2"&lt;BR /&gt;
    * At index-time, FORMAT defaults to &amp;lt;stanza-name&amp;gt;::$1&lt;BR /&gt;
* FORMAT for search-time extractions:&lt;BR /&gt;
    * The format of this field as used during search time extractions is as follows:&lt;BR /&gt;
        * FORMAT = &amp;lt;field-name&amp;gt;::&amp;lt;field-value&amp;gt;( &amp;lt;field-name&amp;gt;::&amp;lt;field-value&amp;gt;)* &lt;BR /&gt;
            * where:&lt;BR /&gt;
            * field-name  = [&amp;lt;string&amp;gt;|$&amp;lt;extracting-group-number&amp;gt;]&lt;BR /&gt;
            * field-value = [&amp;lt;string&amp;gt;|$&amp;lt;extracting-group-number&amp;gt;]&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;* Search-time extraction examples:

    * 1. FORMAT = first::$1 second::$2 third::other-value 

    * 2. FORMAT = $1::$2 

* If the key-name of a FORMAT setting is varying, for example $1 in the
  example 2 just above, then the regex will continue to match against
  the source key to extract as many matches as are present in the text.
* NOTE: You cannot create concatenated fields with FORMAT at search time. That 
  functionality is only available at index time.
* At search-time, FORMAT defaults to an empty string
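
# Sketch only, not part of the quoted spec: a search-time transform that
# names the field explicitly with FORMAT, as in example 1 above.
# The stanza name and REGEX are assumptions for illustration. The REGEX
# expects the syslog header to still be present in _raw at search time.
# It would be referenced from props.conf with something like
# REPORT-syslog-process = syslog-process-field (class name is hypothetical).
[syslog-process-field]
REGEX  = ^[A-Z][a-z]+\s+\d+\s\d+:\d+:\d+\s\S+\s([^\s:\[]+)
FORMAT = process::$1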
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 07 Jan 2015 21:38:33 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Can-I-transform-data-and-extract-fields-at-once/m-p/175551#M50389</guid>
      <dc:creator>jayannah</dc:creator>
      <dc:date>2015-01-07T21:38:33Z</dc:date>
    </item>
    <item>
      <title>Re: Can I transform data and extract fields at once?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Can-I-transform-data-and-extract-fields-at-once/m-p/175552#M50390</link>
      <description>&lt;P&gt;Hello @stefanlasiewski,&lt;BR /&gt;
Your regex statement will work just fine if you simply add it to the REGEX setting in transforms.conf. I do what you are doing all the time.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[syslog-header-stripper-ts-host-proc]
REGEX = yourRegex statement
&lt;/CODE&gt;&lt;/PRE&gt;
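
&lt;P&gt;As a minimal sketch of what a search-time setup could look like: the stanza name &lt;CODE&gt;syslog-header-fields&lt;/CODE&gt; and the class name &lt;CODE&gt;REPORT-syslog-header&lt;/CODE&gt; are made up for illustration, and this assumes the syslog header is still present in _raw, i.e. it has not already been stripped at index time. In &lt;CODE&gt;local/transforms.conf&lt;/CODE&gt;:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# Hypothetical stanza name. The REGEX is the pattern from the question.
# At search time, named capturing groups become fields directly,
# so this sketch sets neither FORMAT nor DEST_KEY.
[syslog-header-fields]
REGEX = ^(?&amp;lt;SyslogPri&amp;gt;&amp;lt;\d+&amp;gt;)(?&amp;lt;SyslogDate&amp;gt;[A-Z][a-z]+\s+\d+\s\d+:\d+:\d+)\s(?&amp;lt;SyslogHost&amp;gt;.*)\s(?&amp;lt;process&amp;gt;.*):\s(?&amp;lt;SyslogMessage&amp;gt;.*)$
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;And in &lt;CODE&gt;local/props.conf&lt;/CODE&gt;, referenced with &lt;CODE&gt;REPORT-&lt;/CODE&gt; (search time) rather than &lt;CODE&gt;TRANSFORMS-&lt;/CODE&gt; (index time):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[syslog]
REPORT-syslog-header = syslog-header-fields
&lt;/CODE&gt;&lt;/PRE&gt;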

&lt;P&gt;This will work for search-time extraction, but are you trying to create an index-time extraction? In your example it seems like you are trying to overwrite the _raw data.&lt;/P&gt;</description>
      <pubDate>Thu, 08 Jan 2015 00:37:17 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Can-I-transform-data-and-extract-fields-at-once/m-p/175552#M50390</guid>
      <dc:creator>bmacias84</dc:creator>
      <dc:date>2015-01-08T00:37:17Z</dc:date>
    </item>
    <item>
      <title>Re: Can I transform data and extract fields at once?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Can-I-transform-data-and-extract-fields-at-once/m-p/175553#M50391</link>
      <description>&lt;P&gt;I don't really care where this extraction is happening. I'm fine with anywhere, as long as it's fast and easy to do. I just want to use the fields. I'm only using the _raw data because that's what the &lt;A href="http://wiki.splunk.com/Community:StripSyslog"&gt;Splunk docs suggest&lt;/A&gt;, and I'm using the default Transform named &lt;CODE&gt;syslog-header-stripper-ts-host-proc&lt;/CODE&gt; from &lt;CODE&gt;default/transforms.conf&lt;/CODE&gt;.&lt;/P&gt;</description>
      <pubDate>Thu, 08 Jan 2015 01:32:32 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Can-I-transform-data-and-extract-fields-at-once/m-p/175553#M50391</guid>
      <dc:creator>stefanlasiewski</dc:creator>
      <dc:date>2015-01-08T01:32:32Z</dc:date>
    </item>
    <item>
      <title>Re: Can I transform data and extract fields at once?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Can-I-transform-data-and-extract-fields-at-once/m-p/175554#M50392</link>
      <description>&lt;P&gt;Can you show me an example that you use for the &lt;CODE&gt;FORMAT&lt;/CODE&gt; and &lt;CODE&gt;DEST_KEY&lt;/CODE&gt;? I'm confused by how those should be used.&lt;/P&gt;</description>
      <pubDate>Thu, 08 Jan 2015 01:34:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Can-I-transform-data-and-extract-fields-at-once/m-p/175554#M50392</guid>
      <dc:creator>stefanlasiewski</dc:creator>
      <dc:date>2015-01-08T01:34:12Z</dc:date>
    </item>
  </channel>
</rss>

