Distribution of Browser types based on the Apache logs

nikhilkotagiri
Explorer

We are currently indexing our Apache access logs with Splunk. From the events in those logs, we want to know the percentage of each browser type.

For example, from the logs we need to find the percentage use of Mozilla, MSIE, and other browsers (X% Mozilla; Y% MSIE; etc.).

We tried this using the extract fields option, but it's not working as we expected.

Is there a special app for Apache logs?

Thanks in Advance.

Nikhil

gkanapathy
Splunk Employee

You should make sure the data is indexed with the sourcetype "access_combined", which provides most of the extractions you need. You could of course build your own extractions or duplicate that sourcetype, but it sounds like you haven't had much luck with that.

If you've done that then

sourcetype=access_combined | top useragent

will do it. Of course, this will give the full User-Agent header value. You can decode this, or you can try to classify it to whatever level of detail you like by constructing a table, perhaps using a resource like: http://www.useragentstring.com/pages/Browserlist/ or some other table, or by using | rex to further parse the useragent field.
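As a rough sketch of the classification idea, an eval with case() and match() can bucket the useragent field into broad browser names before top computes the count and percent for each. The browser labels and patterns below are illustrative assumptions, not a complete list, and the order matters because Chrome user agents also contain "Safari".

```spl
sourcetype=access_combined
| eval browser=case(
    match(useragent, "(?i)opera|opr/"), "Opera",
    match(useragent, "(?i)msie|trident/"), "MSIE",
    match(useragent, "(?i)firefox/"), "Firefox",
    match(useragent, "(?i)chrome/"), "Chrome",
    match(useragent, "(?i)safari/"), "Safari",
    true(), "Other")
| top limit=10 browser
```

The final true() branch collects anything unmatched under "Other", so every event with a useragent value is counted.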

nikhilkotagiri
Explorer

Thanks for the help.

Can you please provide an example of a regular expression that gets only the browser type? I am trying to generate this using the extract fields option, but it's throwing an error for us:

Invalid regex: no named extraction at position 5 (i.e., "(compatibl..."). Expected "(?P<variable>pattern)"

Thanks,
Nikhil

splunk
Splunk Employee

N :

A typical Apache weblog contains all the relevant user OS and browser make/model info. If it's in the logs, you'll be able to construct a search in Splunk for it without the use of a special 'Apache' Splunk app.

The pertinent user agent information in an Apache log looks something like this:

"Mozilla/5.0 (Windows; U; Windows NT 6.0; en-us; rv:1.9.2.3) Gecko/20100401 WINFC 2.0.0.1 Firefox/3.6.3 (.NET CLR 3.5.30729)"

Where:

  • Mozilla/5.0 is the family of browser (the 5.0 is some legacy Netscape business)

  • (Windows; U; Windows NT 6.0; en-us; rv:1.9.2.3) is the general OS and build information: platform, security level (U means strong encryption), OS version, language variant, and rendering engine revision number

  • Gecko/20100401 is the rendering engine and build date

  • Firefox/3.6.3 is the actual browser and version used

If you're splunking all of your Apache logs, I believe 'useragent' should be an automatically extracted field. Check the 'other interesting fields' section in the light blue field-picking pane within the search app. Clicking through the useragent field will give you the most granular expression of this information, but you'll want to present your results in a much more general format to make an adequate comparison of browser types.

I would start the search with eventtype=pageview to restrict the results to pageview activity only. From there, I'd compare the most general terms: events that include 'firefox', 'safari', etc.

This probably doesn't answer your question since I can't think of the correct search off the top of my head, but hopefully this gives you some context and points you in the right direction.
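For reference, the "no named extraction" error quoted earlier in the thread means the pattern contained only a plain capture group; the extraction needs a named group of the form (?P<name>pattern). A minimal sketch along those lines, where the field name browser and the list of browser keywords are assumptions chosen for illustration:

```spl
sourcetype=access_combined
| rex field=useragent "(?i)(?P<browser>firefox|msie|chrome|safari|opera)"
| top browser
```

The regex captures the leftmost keyword it finds, so a Chrome user agent (which also mentions Safari later in the string) is labeled as Chrome. Events whose user agent matches none of the keywords get no browser field and do not appear in the top results.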
