I am trying to extract the browser information from the raw log data below and haven't been able to do so. Can anyone please explain how to get it? I haven't yet managed to write complex regular expressions.
2012-11-26 19:41:42 10.64.182.218 GET /_js/mbox.js - 80 - 10.64.182.224 Mozilla/4.0+(compatible;+MSIE+8.0;+Windows+NT+5.1;+Trident/4.0;+.NET+CLR+1.1.4322;+.NET+CLR+2.0.50727;+InfoPath.2;+.NET+CLR+3.0.4506.2152;+.NET+CLR+3.5.30729;+MS-RTC+LM+8;+.NET4.0C;+.NET4.0E)
I couldn't find a definitive list of permissible characters for user agent strings. So, as long as all log entries follow the same format, you can try this regex:
That just means anything that's not a space at the end of the log entry. Since the fields of the log entry are delimited by spaces, you should be good to go with that. Otherwise you can try something like:
Which is "Mozilla" followed by all the characters I found in example user agent strings. You can also try:
Which just means anything not a space following "Mozilla".
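For anyone who wants to try this out, here is a minimal sketch of two of the approaches described above, assuming Python's `re` module. The exact patterns (`[^ ]+$` and `Mozilla[^ ]*`) are my reading of the descriptions rather than a definitive answer, and the function names are made up for illustration.

```python
import re

# Sample entry from the question. Fields are separated by spaces, and the
# user agent is the last field.
LOG_LINE = (
    "2012-11-26 19:41:42 10.64.182.218 GET /_js/mbox.js - 80 - 10.64.182.224 "
    "Mozilla/4.0+(compatible;+MSIE+8.0;+Windows+NT+5.1;+Trident/4.0;"
    "+.NET+CLR+1.1.4322;+.NET+CLR+2.0.50727;+InfoPath.2;"
    "+.NET+CLR+3.0.4506.2152;+.NET+CLR+3.5.30729;+MS-RTC+LM+8;"
    "+.NET4.0C;+.NET4.0E)"
)

# Assumed pattern: one or more non-space characters at the end of the entry.
LAST_FIELD = re.compile(r"[^ ]+$")

# Assumed pattern: "Mozilla" followed by anything that is not a space.
MOZILLA_FIELD = re.compile(r"Mozilla[^ ]*")


def last_field(line):
    """Return the last space-delimited field of a log entry, or None."""
    # Strip trailing whitespace so a newline does not end up in the match.
    match = LAST_FIELD.search(line.rstrip())
    return match.group(0) if match else None


def mozilla_field(line):
    """Return the field starting with "Mozilla", or None if there is none."""
    match = MOZILLA_FIELD.search(line)
    return match.group(0) if match else None


print(last_field(LOG_LINE))
```

Both functions return the same string for the sample entry. The first one only works if the user agent really is the last field on every line, and the second one only works for user agents that start with "Mozilla".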
Regex is complicated but powerful; it's worth learning.
Sorry, but making sense of user agent strings is ridiculously complex, because there's simply no universal standard for how they're formatted. Your web analytics app might have some built-in support for this.
That's my point - if you just catch the initial "Mozilla" you won't be able to differentiate between browsers at all. Both Opera and Internet Explorer commonly use "Mozilla" at the beginning of their user-agent string.
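To illustrate, here is a rough sketch, assuming Python, of looking for browser-specific tokens inside the string instead of the leading "Mozilla". The function name `browser_family` and the second sample string are only illustrative, and the two checks are nowhere near a complete detection scheme.

```python
import re

# Matches "MSIE 8.0" as well as "MSIE+8.0" (some logs write spaces as "+").
MSIE_VERSION = re.compile(r"MSIE[ +](\d+(?:\.\d+)*)")


def browser_family(user_agent):
    """Very rough guess at the browser based on tokens in the user agent."""
    # Check Opera first: some Opera versions also include an MSIE token.
    if "Opera" in user_agent:
        return "Opera"
    match = MSIE_VERSION.search(user_agent)
    if match:
        return "Internet Explorer " + match.group(1)
    return "unknown"


# Both samples start with "Mozilla", so that prefix alone tells you nothing.
ie_sample = "Mozilla/4.0+(compatible;+MSIE+8.0;+Windows+NT+5.1;+Trident/4.0)"
opera_sample = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; en) Opera 8.50"

print(browser_family(ie_sample))
print(browser_family(opera_sample))
```

Anything that matches neither token falls through to "unknown", which is exactly why a maintained user agent parser or the analytics tool's own support is usually the better choice for real data.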