Splunk Search

How to create a report to get session based user-agent statistics on a per site basis (from Tomcat logs)?

michael_sleep
Communicator

Hey there,

I've been learning how to use the search features in Splunk and trying to find a way to get some user-agent metrics from all of our aggregated Tomcat log data. There are 3 key fields that I'm looking at to determine how many browsers are hitting different websites that we run:

site (this is the domain URL from the logs, consider the format to be blahblah.companyname.com)
tr_cookie (a tracking cookie with a unique id per session)
http_user_agent (the user agent we pull from each log entry)

What I need to do is generate a report that counts how many http_user_agents exist using the tr_cookie as a source (this cookie will never have two different http_user_agent values) and then break that down into statistics on a per 'site' basis. I have two searches so far:

index=tomcat_logs  "GET /company/index.php " | stats dc(tr_cookie) as sessions by http_user_agent

The above gives me the overall breakdown of how many unique sessions can be attributed to different http_user_agents. So it will say that 100 people are using ie8. 50 people are using Firefox and etc, ex:

Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko    541
Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko   198
Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.94 Safari/537.36  172

But what I really need is something like this:

somecompany.companyname.com:
     Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko   300
     Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko   100
     Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.94 Safari/537.36 75

someothercompany.companyname.com:
     Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko   200
     Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko   110
     Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.94 Safari/537.36 63

The closest that I have gotten is this search, but it just lists the statistics line by line with the "site" line repeating over and over:

 index=tomcat_logs  "GET /company/index.php " | stats dc(SKP_TS) as sessions by http_user_agent, site

Example:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)  XXXXX.companyname.com   1
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727; .NET CLR 1.1.4322; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)  XXXXX.companyname.com   1
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729) XXXXX.companyname.com   1
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 3.0.04506.30; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)   XXXXX.companyname.com   2
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 10.0; WOW64; Trident/7.0; .NET4.0C; .NET4.0E; InfoPath.3; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30729) YYYYY.companyname.com   1
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 10.0; WOW64; Trident/7.0; Touch; .NET4.0C; .NET4.0E; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30729; Tablet PC 2.0)   ZZZZZ.companyname.com   1

Any idea how to make this happen?

0 Karma
1 Solution

lycollicott
Motivator

This works.

| search .....
| stats dc(SKP_TS) as sessions by site, http_user_agent
| streamstats current=false last(site) as previousSite 
| eval site=if(match(site,previousSite),"",site)
| fields site, http_user_agent, sessions

I used sideview's answer from https://answers.splunk.com/answers/25102/question-regarding-grouping-of-results-into-a-table.html

View solution in original post

0 Karma

lycollicott
Motivator

This works.

| search .....
| stats dc(SKP_TS) as sessions by site, http_user_agent
| streamstats current=false last(site) as previousSite 
| eval site=if(match(site,previousSite),"",site)
| fields site, http_user_agent, sessions

I used sideview's answer from https://answers.splunk.com/answers/25102/question-regarding-grouping-of-results-into-a-table.html

0 Karma

michael_sleep
Communicator

That does indeed the give the effect I'm looking for.

0 Karma

woodcock
Esteemed Legend

Like this:

... | stats dc(SKP_TS) AS sessions BY site http_user_agent
0 Karma

michael_sleep
Communicator

Also close but it looks like this:

 Site:                        User-Agent:            Sessions: 
 blah.companyname.com                 Internet Explorer 8        3
 blah.companyname.com                       Internet Explorer 9            2
 blah.companyname.com                       Internet Explorer 10            5

 junk.companyname.com                 Internet Explorer 8        2
 junk.companyname.com                        Internet Explorer 9            6
 junk.companyname.com                         Internet Explorer 10            5
0 Karma

sundareshr
Legend

Try this

... | stats values(http_user_agent) as UA dc(SKP_TS) as sessions by site
0 Karma

michael_sleep
Communicator

That's somewhat closer but what that returns is the total count of unique SKP_TS sessions for each site, along side each unique value of the user-agent. What I need is the unique count of SKP_TS sessions per user-agent per site.

So instead of:

Site:                       User-Agent:         Sessions: 
blah.companyname.com    Internet Explorer 8     10
                        Internet Explorer 9
                        Internet Explorer 10

junk.companyname.com    Internet Explorer 8     13
                        Internet Explorer 9
                        Internet Explorer 10

It should be:

Site:                       User-Agent:         Sessions: 
blah.companyname.com    Internet Explorer 8     3
                        Internet Explorer 9         2
                        Internet Explorer 10            5

junk.companyname.com    Internet Explorer 8     2
                        Internet Explorer 9         6
                        Internet Explorer 10            5
0 Karma

michael_sleep
Communicator

I know the syntax doesn't make sense but in my mind it's like this:

index=tomcat_logs  "GET /company/index.php " | (stats dc(tr_cookie) as sessions by http_user_agent) by site

Essentially using by twice.

0 Karma
Get Updates on the Splunk Community!

Enter the Splunk Community Dashboard Challenge for Your Chance to Win!

The Splunk Community Dashboard Challenge is underway! This is your chance to showcase your skills in creating ...

.conf24 | Session Scheduler is Live!!

.conf24 is happening June 11 - 14 in Las Vegas, and we are thrilled to announce that the conference catalog ...

Introducing the Splunk Community Dashboard Challenge!

Welcome to Splunk Community Dashboard Challenge! This is your chance to showcase your skills in creating ...