Splunk Enterprise

How to extract browser version from useragent

shashank_24
Path Finder

Hi, is there an easy and straightforward way of extracting browser versions from access logs using the useragent string?

I have a requirement to list the top browsers and the top versions of each browser. I managed to extract the browser using the eval expression below, but getting the browser versions is tricky.


| eval browser = case(match(useragent,"Firefox"),"FireFox", match(useragent,"Chrome") AND NOT match(useragent,"Edge"),"Chrome", match(useragent,"Safari") AND NOT match(useragent,"Chrome"),"Safari", match(useragent, "MSIE|Trident|Edge"), "IE", NOT match(useragent, "Chrome|Firefox|Safari|MSIE|Trident|Edge"), "OTHERS")


Has anyone done this before who could steer me in the right direction? I can't install any apps, so it has to be done with regex. Please let me know if you can help. Thanks very much in advance.

Some examples of Useragent Strings -

Mozilla/5.0 (Linux; Android 9; ANE-LX1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.87 Mobile Safari/537.36

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.3 Safari/605.1.15

Mozilla/5.0 (iPhone; CPU iPhone OS 15_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) GSA/198.0.425262635 Mobile/15E148 Safari/604.1

Best Regards,

Shashank

PickleRick
SplunkTrust

The HTTP/1.1 RFC (RFC 7231) describes the User-Agent format:

https://datatracker.ietf.org/doc/html/rfc7231#section-5.5.3

A user agent SHOULD send a User-Agent field in each request
   unless specifically configured not to do so.

     User-Agent = product *( RWS ( product / comment ) )

   The User-Agent field-value consists of one or more product
   identifiers, each followed by zero or more comments (Section 3.2 of
   [RFC7230]), which together identify the user agent software and its
   significant subproducts.  By convention, the product identifiers are
   listed in decreasing order of their significance for identifying the
   user agent software.


In theory, both the product name and its version should each be a "token", which means neither may contain whitespace (a product is written as name/version). A "comment" is any string enclosed in parentheses.

Remember, though, that it's "just" an RFC and User-Agent is a client-supplied value, so anything can appear in there. You can probably classify all such outliers as "other".
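
A minimal sketch of that idea, assuming the field is called useragent as in the question. It drops the parenthesised comments first, then pulls every name/version pair out of what is left. The field names ua_no_comments, ua_product, ua_version, ua_pairs and ua_status are made up for illustration, and nested comments (which the RFC allows but are rare in practice) are not handled:

```spl
| eval ua_no_comments = replace(useragent, "[(][^)]*[)]", " ")
| rex field=ua_no_comments max_match=0 "(?<ua_product>[^\s/]+)/(?<ua_version>\S+)"
| eval ua_pairs = mvzip(ua_product, ua_version, "=")
| eval ua_status = if(isnull(ua_product), "other", "parsed")
```

With max_match=0, ua_product and ua_version come back as multivalue fields in the same order, so for the first sample string in the question you would expect the products Mozilla, AppleWebKit, Chrome and Safari, each paired with its version. Strings with no name/version token at all fall into "other".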

ITWhisperer
SplunkTrust

There is no universal standard adopted by all browsers for the format of the user agent string, so any set of regexes to extract this is likely to be incomplete at best.
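
With that caveat, here is a best-effort sketch for the four browser labels used in the question's eval, to be appended after that eval. It assumes the browser field already exists and that Safari reports its version in the Version/ token, as in the Safari sample above. The field names firefox_version, chrome_version, safari_version, ie_version, browser_version and browser_major are hypothetical. Anything that does not match ends up as "unknown", for example the GSA sample, which has no Version/ token:

```spl
| rex field=useragent "Firefox/(?<firefox_version>[\d.]+)"
| rex field=useragent "Chrome/(?<chrome_version>[\d.]+)"
| rex field=useragent "Version/(?<safari_version>[\d.]+)"
| rex field=useragent "(?:MSIE |rv:|Edge/)(?<ie_version>[\d.]+)"
| eval browser_version = coalesce(case(
    browser=="FireFox", firefox_version,
    browser=="Chrome", chrome_version,
    browser=="Safari", safari_version,
    browser=="IE", ie_version), "unknown")
| eval browser_major = mvindex(split(browser_version, "."), 0)
| top limit=10 browser, browser_major
```

The coalesce keeps events whose version could not be extracted, since top would otherwise skip rows where the field is null. Note that Chromium-based Edge identifies itself with an Edg/ token rather than Edge/, so the question's eval will count it as Chrome unless you add a separate branch for it.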
