Splunk Enterprise

How to extract browser version from useragent

shashank_24
Path Finder

Hi, is there an easy and straightforward way of extracting browser versions from access logs using the useragent string?

I have a requirement to list the top browsers and the top versions of each browser. I managed to extract the browser using the eval expression below, but getting the browser versions is tricky.

| eval browser = case(match(useragent,"Firefox"),"FireFox", match(useragent,"Chrome") AND NOT match(useragent,"Edge"),"Chrome", match(useragent,"Safari") AND NOT match(useragent,"Chrome"),"Safari", match(useragent, "MSIE|Trident|Edge"), "IE", NOT match(useragent, "Chrome|Firefox|Safari|MSIE|Trident|Edge"), "OTHERS")

Has anyone done this before who could steer me in the right direction? I can't install any apps, so it has to be done with regex. Any help is very much appreciated.

Some examples of useragent strings:

Mozilla/5.0 (Linux; Android 9; ANE-LX1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.87 Mobile Safari/537.36

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.3 Safari/605.1.15

Mozilla/5.0 (Linux; Android 9; ANE-LX1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.87 Mobile Safari/537.36

Mozilla/5.0 (iPhone; CPU iPhone OS 15_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) GSA/198.0.425262635 Mobile/15E148 Safari/604.1

Best Regards,

Shashank

0 Karma

PickleRick
SplunkTrust

The HTTP/1.1 RFC (RFC 7231, section 5.5.3) specifies the User-Agent format:

https://datatracker.ietf.org/doc/html/rfc7231#section-5.5.3

A user agent SHOULD send a User-Agent field in each request
   unless specifically configured not to do so.

     User-Agent = product *( RWS ( product / comment ) )

   The User-Agent field-value consists of one or more product
   identifiers, each followed by zero or more comments (Section 3.2 of
   [RFC7230]), which together identify the user agent software and its
   significant subproducts.  By convention, the product identifiers are
   listed in decreasing order of their significance for identifying the
   user agent software.


Theoretically, both "product" and "version" here should be a "token", which means they must not contain whitespace, and a "comment" is any string enclosed in parentheses.

Remember, though, that it is "just" an RFC and User-Agent is a client-supplied value, so it can contain anything. You can probably classify all those outliers as "other".
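
As a rough sketch of that grammar in SPL (an illustration under assumptions, not a full RFC parser), rex with max_match=0 can collect every product/version pair into multivalue fields. The field names ua_product, ua_version and ua_pairs are made up for this example, and any product-like token that happens to sit inside a parenthesised comment (for example Trident/7.0) would be picked up as well.

```spl
| rex field=useragent max_match=0 "(?<ua_product>[A-Za-z][\w.-]*)/(?<ua_version>[\w.]+)"
| eval ua_pairs = mvzip(ua_product, ua_version, "/")
```

For the first sample string in the question this should yield Mozilla/5.0, AppleWebKit/537.36, Chrome/98.0.4758.87 and Safari/537.36, which you can then filter for the product you care about.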

0 Karma

ITWhisperer
SplunkTrust

There is no universal standard adopted by all browsers for the format of the user agent string, so any set of regex to extract this is likely to be incomplete at best.
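
Within those limits, a regex-only sketch that covers the sample strings in the question might look like the following. It assumes the browser field produced by the eval in the question (including its "FireFox" and "IE" labels). The firefox_ver, chrome_ver, safari_ver, ie_ver, browser_version and browser_major field names are invented for this example, and the patterns are deliberately incomplete.

```spl
| rex field=useragent "Firefox/(?<firefox_ver>\d+(?:\.\d+)*)"
| rex field=useragent "Chrome/(?<chrome_ver>\d+(?:\.\d+)*)"
| rex field=useragent "Version/(?<safari_ver>\d+(?:\.\d+)*)"
| rex field=useragent "(?:MSIE |Edge/|rv:)(?<ie_ver>\d+(?:\.\d+)*)"
| eval browser_version = case(browser=="FireFox", firefox_ver, browser=="Chrome", chrome_ver, browser=="Safari", safari_ver, browser=="IE", ie_ver)
| eval browser_major = mvindex(split(browser_version, "."), 0)
| fillnull value="unknown" browser_version browser_major
| top limit=10 browser browser_major
```

Two gaps are worth knowing about. Chromium-based Edge identifies itself with Edg/ rather than Edge/, so the eval in the question would count it as Chrome. And a string such as the GSA sample contains Safari/ but no Version/ token, so its version ends up as "unknown".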

0 Karma