Splunk Search

Need help with regex

theouhuios
Motivator

Hello

I am trying to extract the browser information from the raw data below and haven't been able to. Can anyone please explain how to get it? I haven't yet managed to write complex regular expressions.

2012-11-26 19:41:42  10.64.182.218 GET /_js/mbox.js - 80 - 10.64.182.224 Mozilla/4.0+(compatible;+MSIE+8.0;+Windows+NT+5.1;+Trident/4.0;+.NET+CLR+1.1.4322;+.NET+CLR+2.0.50727;+InfoPath.2;+.NET+CLR+3.0.4506.2152;+.NET+CLR+3.5.30729;+MS-RTC+LM+8;+.NET4.0C;+.NET4.0E)

Regards

theou

0 Karma

tpederson
Path Finder

I couldn't find a definitive list of permissible characters for user agent strings. So, as long as all log entries have the same format, you can try this regex:

\S*$

That just matches the run of non-space characters at the end of the log entry. Since the fields of the log entry are delimited by spaces (and the spaces inside the user agent are logged as "+"), you should be good to go with that. Otherwise you can try something like:

Mozilla[\.\d\w:;+/()-]*

Which is "Mozilla" followed by any of the characters I found in example user agent strings. You could also try:

Mozilla[^\s]*

Which just means anything not a space following "Mozilla".
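
To see how the three patterns behave, here is a minimal sketch in Python rather than Splunk, run against the sample log line from the question. The names LOG_LINE, PATTERNS and extract_user_agent are made up for this illustration, and it assumes the user agent is always the last space-delimited field with no trailing whitespace.

```python
import re

# Sample log line copied from the question; fields are space-delimited.
LOG_LINE = (
    "2012-11-26 19:41:42  10.64.182.218 GET /_js/mbox.js - 80 - 10.64.182.224 "
    "Mozilla/4.0+(compatible;+MSIE+8.0;+Windows+NT+5.1;+Trident/4.0;"
    "+.NET+CLR+1.1.4322;+.NET+CLR+2.0.50727;+InfoPath.2;"
    "+.NET+CLR+3.0.4506.2152;+.NET+CLR+3.5.30729;+MS-RTC+LM+8;+.NET4.0C;+.NET4.0E)"
)

# The three patterns suggested above (the dictionary keys are hypothetical labels).
PATTERNS = {
    "last_field": r"\S*$",                     # non-space run at end of line
    "char_class": r"Mozilla[\.\d\w:;+/()-]*",  # "Mozilla" plus an explicit character set
    "non_space": r"Mozilla[^\s]*",             # "Mozilla" plus any non-space run
}


def extract_user_agent(line, pattern):
    """Return the first match of pattern in line, or None if there is none."""
    match = re.search(pattern, line)
    return match.group(0) if match else None


results = {name: extract_user_agent(LOG_LINE, p) for name, p in PATTERNS.items()}

for name, value in results.items():
    print(name, "->", value)
```

On this one sample all three patterns return the same string, the whole user agent. They would start to differ on lines where the user agent isn't the last field, doesn't begin with "Mozilla", or contains a character missing from the explicit set.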

Regex is complicated but powerful; it's worth learning.

0 Karma

Ayn
Legend

This app could very well be exactly what you're looking for. http://splunk-base.splunk.com/apps/48017/ta-uas_parser

0 Karma

Ayn
Legend

Sorry, the task of making sense out of user agent strings is ridiculously complex, because there's simply no universal standard for how they're formatted. The web analytics app might have some inbuilt support for this.

0 Karma

theouhuios
Motivator

Oh, I get what you meant now. Any idea on how to approach this?

0 Karma

Ayn
Legend

That's my point - if you just catch the initial "Mozilla" you won't be able to differentiate between browsers at all. Both Opera and Internet Explorer commonly use "Mozilla" at the beginning of their user-agent string.
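
As a rough illustration of that point, here is a small Python sketch that looks for browser-specific tokens instead of the leading "Mozilla". The rule list, the classify function and the sample strings are all assumptions made up for this example; it is a crude heuristic, nowhere near a complete user-agent parser. Order matters: older Opera versions can include "MSIE" in their string and Chrome includes "Safari", so the more specific token has to be checked first.

```python
import re
from collections import Counter

# Ordered (label, pattern) rules; the first matching rule wins.
# Illustrative only: real-world user agents need far more cases than this.
RULES = [
    ("Opera", re.compile(r"Opera|OPR/")),
    ("Internet Explorer", re.compile(r"MSIE|Trident/")),
    ("Chrome", re.compile(r"Chrome/")),
    ("Firefox", re.compile(r"Firefox/")),
    ("Safari", re.compile(r"Safari/")),
]


def classify(user_agent):
    """Return a coarse browser label for a user-agent string."""
    for label, pattern in RULES:
        if pattern.search(user_agent):
            return label
    return "Other"


# Sample strings: the first is shortened from the question, the rest are
# made-up examples in the same "+"-for-space log format.
SAMPLES = [
    "Mozilla/4.0+(compatible;+MSIE+8.0;+Windows+NT+5.1;+Trident/4.0)",
    "Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+en)+Opera+8.50",
    "Mozilla/5.0+(Windows+NT+6.1;+rv:17.0)+Gecko/20100101+Firefox/17.0",
    "Mozilla/4.0+(compatible;+MSIE+7.0;+Windows+NT+5.1)",
]

# Tally browsers, most used first.
counts = Counter(classify(ua) for ua in SAMPLES)
for browser, hits in counts.most_common():
    print(browser, hits)
```

Every sample here starts with "Mozilla", yet they end up in three different buckets, which is why matching on the prefix alone can't produce a most-used-browsers table.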

0 Karma

theouhuios
Motivator

That's fine. In the whole raw data there are a few from Opera and Internet Explorer too. I just need to make a table to determine which browsers were the most used.

0 Karma

Ayn
Legend

You do know that pretty much all browsers use "Mozilla" in their user-agent string? http://en.wikipedia.org/wiki/User_agent#Format

0 Karma

theouhuios
Motivator

Nope. Just the browser info. In this data only Mozilla.

0 Karma

Ayn
Legend

Which info do you want? The whole user-agent string?

0 Karma