<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Extracting the User-Agent HTTP header from an Apache log in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Extracting-the-User-Agent-HTTP-header-from-an-Apache-log/m-p/19891#M3069</link>
    <description>&lt;P&gt;You need to either build a lookup table or use a custom command to parse the user agent string.  Looks like this might do the trick:&lt;/P&gt;

&lt;P&gt;&lt;A href="http://splunk-base.splunk.com/apps/48017/ta-uas_parser"&gt;http://splunk-base.splunk.com/apps/48017/ta-uas_parser&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Tue, 30 Jul 2013 22:36:57 GMT</pubDate>
    <dc:creator>jstockamp</dc:creator>
    <dc:date>2013-07-30T22:36:57Z</dc:date>
    <item>
      <title>Extracting the User-Agent HTTP header from an Apache log</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Extracting-the-User-Agent-HTTP-header-from-an-Apache-log/m-p/19890#M3068</link>
      <description>&lt;P&gt;Looking at all the posts regarding User-Agent HTTP header searches, one commonality is that the posters were told to change their log format to Combined Log Format.  Unfortunately I cannot do that, but I am still being asked to create dashboard reports showing the most common OS and the most common browser.  Here is a sample log entry:&lt;/P&gt;

&lt;P&gt;XX.XX.XX.XX - - [30/Jul/2013:15:16:40 -0700] 0 "GET /portal-web/images/denied.png HTTP/1.1" 200 882 "htps://ABC.ABC.com/portal-web/stuff/stuff.action" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.0)"&lt;/P&gt;

&lt;P&gt;Ultimately I want separate count columns for browser type and OS type.  How do I go about extracting the info I want?  I believe I need to use a regex, but I am unsure how to proceed, especially since both the client and browser portions of the string vary in length.&lt;/P&gt;</description>
      <pubDate>Tue, 30 Jul 2013 22:30:53 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Extracting-the-User-Agent-HTTP-header-from-an-Apache-log/m-p/19890#M3068</guid>
      <dc:creator>Armyeric</dc:creator>
      <dc:date>2013-07-30T22:30:53Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting the User-Agent HTTP header from an Apache log</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Extracting-the-User-Agent-HTTP-header-from-an-Apache-log/m-p/19891#M3069</link>
      <description>&lt;P&gt;You need to either build a lookup table or use a custom command to parse the user agent string.  Looks like this might do the trick:&lt;/P&gt;

&lt;P&gt;&lt;A href="http://splunk-base.splunk.com/apps/48017/ta-uas_parser"&gt;http://splunk-base.splunk.com/apps/48017/ta-uas_parser&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 30 Jul 2013 22:36:57 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Extracting-the-User-Agent-HTTP-header-from-an-Apache-log/m-p/19891#M3069</guid>
      <dc:creator>jstockamp</dc:creator>
      <dc:date>2013-07-30T22:36:57Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting the User-Agent HTTP header from an Apache log</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Extracting-the-User-Agent-HTTP-header-from-an-Apache-log/m-p/19892#M3070</link>
      <description>&lt;P&gt;I would love to use an app, but our Admin doesn't want to use any apps...so I am stuck.&lt;/P&gt;</description>
      <pubDate>Tue, 30 Jul 2013 22:40:53 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Extracting-the-User-Agent-HTTP-header-from-an-Apache-log/m-p/19892#M3070</guid>
      <dc:creator>Armyeric</dc:creator>
      <dc:date>2013-07-30T22:40:53Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting the User-Agent HTTP header from an Apache log</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Extracting-the-User-Agent-HTTP-header-from-an-Apache-log/m-p/19893#M3071</link>
      <description>&lt;P&gt;A pure regex is not going to do it alone.  If you are new to this, the interactive field extraction creator can help. It is one of the options in the per-record drop-down.&lt;/P&gt;

&lt;P&gt;&lt;IMG src="http://splunk-base.splunk.com/storage/Splunk-field-extraction.png" alt="alt text" /&gt;&lt;/P&gt;

&lt;P&gt;The difficulty is that there is no defined order or format for the subfields of the UA.  I just tried it myself, giving the generator the following sample values culled from recent access logs to weave its magic on:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Windows NT 5.1
Linux x86_64
Windows NT 6.0
Android 4.1.2
Windows Phone OS 7.5
Windows NT 6.1
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The resulting sample extractions it offered were:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Linux x86_64
Windows NT 5.1
+http://yandex.com/bots)" RU
Windows NT 5.1)" US
http://www.majestic12.co.uk/bot.php?+)" US
rv:17.0) Gecko/20130626 Firefox/17.0 Iceweasel/17.0.7" FR
+http://www.exabot.com/go/robot)" FR
Windows NT 6.2
Mail.RU_Bot/2.0
Windows NT 6.0)" JP
Windows NT 6.1
Windows NT 6.0)" CN
+http://www.google.com/bot.html)" US
Android 4.1.2
+http://www.bing.com/bingbot.htm)" US
+http://www.baidu.com/search/spider.html)" CN
Windows Phone OS 7.5
&lt;/CODE&gt;&lt;/PRE&gt;
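
&lt;P&gt;If an app is not an option, a rough approximation can be done inline by testing for known substrings in a fixed order and bucketing everything else as "other". The search below is only a minimal sketch under stated assumptions: the sourcetype name is a placeholder, the field names useragent, os and browser are made up for illustration, and the UA is assumed to be the last quoted field on the line, as in the sample log from the question. The pattern list is deliberately tiny and will misclassify bots and anything it does not list. Order matters: "Windows Phone" is tested before "Windows NT", and "Android" before "Linux", because Android agents usually also contain "Linux".&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype=your_apache_sourcetype
| rex "\"(?&amp;lt;useragent&amp;gt;[^\"]*)\"\s*$"
| eval os=case(match(useragent, "Windows Phone"), "Windows Phone",
               match(useragent, "Windows NT"), "Windows",
               match(useragent, "Android"), "Android",
               match(useragent, "Linux"), "Linux",
               true(), "other")
| eval browser=case(match(useragent, "MSIE"), "Internet Explorer",
                    match(useragent, "Firefox"), "Firefox",
                    true(), "other")
| stats count by os, browser
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Logs that carry extra fields after the UA, such as the trailing country code in my samples above, would need the rex pattern adjusted.&lt;/P&gt;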

&lt;P&gt;Even after some manual refinement, the generated extraction continues to miss the mark more than hit it.&lt;/P&gt;</description>
      <pubDate>Wed, 31 Jul 2013 00:25:42 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Extracting-the-User-Agent-HTTP-header-from-an-Apache-log/m-p/19893#M3071</guid>
      <dc:creator>grijhwani</dc:creator>
      <dc:date>2013-07-31T00:25:42Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting the User-Agent HTTP header from an Apache log</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Extracting-the-User-Agent-HTTP-header-from-an-Apache-log/m-p/19894#M3072</link>
      <description>&lt;P&gt;Correct. There is no way to do this just by parsing. UA strings are not strongly specified; they are mostly suggestive. If you need great accuracy, you must use a lookup that maps known patterns to the value you want. (I mean, technically, you can probably write a regex that includes all the logic of a lookup table, but it would be an impractically enormous regex, so let's just say you can't.)&lt;/P&gt;</description>
      <pubDate>Wed, 31 Jul 2013 02:08:32 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Extracting-the-User-Agent-HTTP-header-from-an-Apache-log/m-p/19894#M3072</guid>
      <dc:creator>gkanapathy</dc:creator>
      <dc:date>2013-07-31T02:08:32Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting the User-Agent HTTP header from an Apache log</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Extracting-the-User-Agent-HTTP-header-from-an-Apache-log/m-p/19895#M3073</link>
      <description>&lt;P&gt;If you want the job done right, you pretty much need an application. There is no simple way to parse a UA string. It requires either a massive lookup, or a combination of complex logic and a slightly-less-massive lookup. If you have a limited number of UA strings, your best bet is to simply enumerate them all in your own lookup, then set any others to "other" or something similar.&lt;/P&gt;</description>
      <pubDate>Wed, 31 Jul 2013 02:10:34 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Extracting-the-User-Agent-HTTP-header-from-an-Apache-log/m-p/19895#M3073</guid>
      <dc:creator>gkanapathy</dc:creator>
      <dc:date>2013-07-31T02:10:34Z</dc:date>
    </item>
  </channel>
</rss>

