<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Do we have to write a custom transform for our Apache combined access log format for proper field extraction? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Do-we-we-have-to-write-a-custom-transform-for-our-Apache/m-p/123982#M33486</link>
    <description>&lt;P&gt;We have the following LogFormat in our Apache configuration:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;LogFormat "%{True-Client-IP}i %h %l %u %t \"%r\" %&amp;gt;s %b \"%{Referer}i\" \"%{User-Agent}i\" %D \"%{x-wily-servlet}o\""
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This is logged as: &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;24.96.82.143 24.143.197.191 - - [30/Mar/2015:13:03:45 -0400] "GET /AST/Main/Belk_Primary/PRD~99999998368WACO/Wacoal+Wacoal+Embrace+Lace+Collection.jsp?navPath=Wacoal&amp;amp;boutiquePage=true&amp;amp;ZZ%3C%3EtP=4294948624&amp;amp;ZZ_OPT=Y&amp;amp;PRODUCT%3C%3Eprd_id=845524442450490&amp;amp;FOLDER%3C%3Efolder_id=2534374302087929&amp;amp;bmUID=kNFqGZg&amp;amp;ViewAll=&amp;amp;changeViewInd=y HTTP/1.1" 200 42823 "http://www.belk.com/AST/Boutiques/Boutiques_Primary/Wacoal.jsp" "Mozilla/5.0 (Windows NT 6.3; WOW64; Trident/7.0; rv:11.0) like Gecko/20100101 Firefox/22.0" 168176140 "Clear appServerIp=74.213.129.193&amp;amp;agentName=WSPRD08F&amp;amp;servletName=__belk_outfit_detail&amp;amp;servletResponseTime=168097&amp;amp;agentHost=belkecaprd20&amp;amp;agentProcess=WebLogic"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The default access extraction in transforms.conf, shown below, does not match our format. Does this mean we have to write a custom transform for our log? Please confirm.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;REGEX = ^[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[[nspaces:bytes]](?:\s++"(?&lt;referer&gt;[[bc_domain:referer_]]?+[^"]*+)"(?:\s++[[qstring:useragent]](?:\s++[[qstring:cookie]])?+)?+)?[[all:other]]
&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Tue, 31 Mar 2015 19:41:38 GMT</pubDate>
    <dc:creator>aruncse83</dc:creator>
    <dc:date>2015-03-31T19:41:38Z</dc:date>
    <item>
      <title>Do we have to write a custom transform for our Apache combined access log format for proper field extraction?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Do-we-we-have-to-write-a-custom-transform-for-our-Apache/m-p/123982#M33486</link>
      <description>&lt;P&gt;We have the following LogFormat in our Apache configuration:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;LogFormat "%{True-Client-IP}i %h %l %u %t \"%r\" %&amp;gt;s %b \"%{Referer}i\" \"%{User-Agent}i\" %D \"%{x-wily-servlet}o\""
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This is logged as: &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;24.96.82.143 24.143.197.191 - - [30/Mar/2015:13:03:45 -0400] "GET /AST/Main/Belk_Primary/PRD~99999998368WACO/Wacoal+Wacoal+Embrace+Lace+Collection.jsp?navPath=Wacoal&amp;amp;boutiquePage=true&amp;amp;ZZ%3C%3EtP=4294948624&amp;amp;ZZ_OPT=Y&amp;amp;PRODUCT%3C%3Eprd_id=845524442450490&amp;amp;FOLDER%3C%3Efolder_id=2534374302087929&amp;amp;bmUID=kNFqGZg&amp;amp;ViewAll=&amp;amp;changeViewInd=y HTTP/1.1" 200 42823 "http://www.belk.com/AST/Boutiques/Boutiques_Primary/Wacoal.jsp" "Mozilla/5.0 (Windows NT 6.3; WOW64; Trident/7.0; rv:11.0) like Gecko/20100101 Firefox/22.0" 168176140 "Clear appServerIp=74.213.129.193&amp;amp;agentName=WSPRD08F&amp;amp;servletName=__belk_outfit_detail&amp;amp;servletResponseTime=168097&amp;amp;agentHost=belkecaprd20&amp;amp;agentProcess=WebLogic"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The default access extraction in transforms.conf, shown below, does not match our format. Does this mean we have to write a custom transform for our log? Please confirm.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;REGEX = ^[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[[nspaces:bytes]](?:\s++"(?&lt;referer&gt;[[bc_domain:referer_]]?+[^"]*+)"(?:\s++[[qstring:useragent]](?:\s++[[qstring:cookie]])?+)?+)?[[all:other]]
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 31 Mar 2015 19:41:38 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Do-we-we-have-to-write-a-custom-transform-for-our-Apache/m-p/123982#M33486</guid>
      <dc:creator>aruncse83</dc:creator>
      <dc:date>2015-03-31T19:41:38Z</dc:date>
    </item>
    <item>
      <title>Re: Do we have to write a custom transform for our Apache combined access log format for proper field extraction?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Do-we-we-have-to-write-a-custom-transform-for-our-Apache/m-p/123983#M33487</link>
      <description>&lt;P&gt;Probably yes. If you change the format of an event that Splunk extracts fields from via regex, you should expect to have to write your own regex. Because, face it, you're not really using "Apache Combined Log Format" anymore, you're using "aruncse83 Apache-Combined-Like Log Format".&lt;/P&gt;
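
&lt;P&gt;As a rough, untested sketch of what that could look like: copy the default access extraction into a new transforms.conf stanza, add one leading field for the True-Client-IP value, and point your sourcetype at it in props.conf. The stanza name &lt;CODE&gt;access-extractions-tcip&lt;/CODE&gt;, the field name &lt;CODE&gt;true_client_ip&lt;/CODE&gt;, and the sourcetype &lt;CODE&gt;my_apache_access&lt;/CODE&gt; are made up for illustration, and I am assuming the rest of the default regex is unchanged in your version. The trailing %D and x-wily-servlet values would presumably land in the catch-all &lt;CODE&gt;other&lt;/CODE&gt; field unless you extend the pattern further.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# transforms.conf (hypothetical stanza and field names)
[access-extractions-tcip]
REGEX = ^[[nspaces:true_client_ip]]\s++[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[[nspaces:bytes]](?:\s++"(?&lt;referer&gt;[[bc_domain:referer_]]?+[^"]*+)"(?:\s++[[qstring:useragent]](?:\s++[[qstring:cookie]])?+)?+)?[[all:other]]

# props.conf (hypothetical sourcetype name)
[my_apache_access]
REPORT-access = access-extractions-tcip
&lt;/CODE&gt;&lt;/PRE&gt;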

&lt;P&gt;One approach I have used before with good success is to add things only to the END of the Apache Combined format, and to add them strictly as key=value items. That way, all of the regex stuff still matches the standard Apache format, and the key=value data is extracted just fine by Splunk's default KV-extract code. You get the benefits of your custom format without any of the associated pain.&lt;/P&gt;</description>
      <pubDate>Thu, 02 Apr 2015 03:01:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Do-we-we-have-to-write-a-custom-transform-for-our-Apache/m-p/123983#M33487</guid>
      <dc:creator>dwaddle</dc:creator>
      <dc:date>2015-04-02T03:01:55Z</dc:date>
    </item>
    <item>
      <title>Re: Do we have to write a custom transform for our Apache combined access log format for proper field extraction?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Do-we-we-have-to-write-a-custom-transform-for-our-Apache/m-p/123984#M33488</link>
      <description>&lt;P&gt;Let's face it, this is a great answer.&lt;/P&gt;</description>
      <pubDate>Thu, 02 Apr 2015 04:17:34 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Do-we-we-have-to-write-a-custom-transform-for-our-Apache/m-p/123984#M33488</guid>
      <dc:creator>aljohnson_splun</dc:creator>
      <dc:date>2015-04-02T04:17:34Z</dc:date>
    </item>
    <item>
      <title>Re: Do we have to write a custom transform for our Apache combined access log format for proper field extraction?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Do-we-we-have-to-write-a-custom-transform-for-our-Apache/m-p/123985#M33489</link>
      <description>&lt;P&gt;Thank you, dwaddle, for the reply above. This is exactly what I did just after posting the question: I changed the regex to match the additional field that is logged by our Apache, by prepending &lt;CODE&gt;^[[nspaces:clientip]]\s++&lt;/CODE&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt; REGEX = ^[[nspaces:clientip]]\s++^[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[[nspaces:bytes]](?:\s++"(?[[bc_domain:referer_]]?+[^"]*+)"(?:\s++[[qstring:useragent]](?:\s++[[qstring:cookie]])?+)?+)?[[all:other]]
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;That fixed the problem with the default extraction. I agree with your recommendation to move all the custom fields to the end in key=value format (that is the standard practice), and we should probably do this at some later time. At this point it is easier for me to make the change on the Splunk side and extract these fields than to adjust the web server, which requires additional paperwork.&lt;/P&gt;</description>
      <pubDate>Thu, 02 Apr 2015 10:58:20 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Do-we-we-have-to-write-a-custom-transform-for-our-Apache/m-p/123985#M33489</guid>
      <dc:creator>aruncse83</dc:creator>
      <dc:date>2015-04-02T10:58:20Z</dc:date>
    </item>
    <item>
      <title>Re: Do we have to write a custom transform for our Apache combined access log format for proper field extraction?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Do-we-we-have-to-write-a-custom-transform-for-our-Apache/m-p/123986#M33490</link>
      <description>&lt;P&gt;Glad it worked for you. One thing I would check, though, is that you're extracting into a field named &lt;CODE&gt;clientip&lt;/CODE&gt; twice in this case. Do you mean to do it like that? If you do, that's probably fine, and I would expect &lt;CODE&gt;clientip&lt;/CODE&gt; to become multivalued. Contextually it's kinda weird to me, but you know your data best.&lt;/P&gt;</description>
      <pubDate>Thu, 02 Apr 2015 13:10:29 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Do-we-we-have-to-write-a-custom-transform-for-our-Apache/m-p/123986#M33490</guid>
      <dc:creator>dwaddle</dc:creator>
      <dc:date>2015-04-02T13:10:29Z</dc:date>
    </item>
    <item>
      <title>Re: Do we have to write a custom transform for our Apache combined access log format for proper field extraction?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Do-we-we-have-to-write-a-custom-transform-for-our-Apache/m-p/123987#M33491</link>
      <description>&lt;P&gt;It is actually the remote IP.&lt;/P&gt;</description>
      <pubDate>Fri, 03 Apr 2015 01:58:54 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Do-we-we-have-to-write-a-custom-transform-for-our-Apache/m-p/123987#M33491</guid>
      <dc:creator>aruncse83</dc:creator>
      <dc:date>2015-04-03T01:58:54Z</dc:date>
    </item>
  </channel>
</rss>

