Splunk Search

Rank data from web access files

ndoshi
Splunk Employee
Splunk Employee

I have web content (articles, stories) where each article is grouped in a category such as NEWS, STORY, etc. Website visitors are grouped by region. In each region, I want to be able to rank each category by the number of site visitors who read articles in a category.

I can get a count by region and category.

I can get a count by Region, VisitorID, Category.

However, I want to know how many site visitors had CAT1 as their most-read category. How many had CAT1 as their second most-read category? How many had CAT2 as their most-read category?

Here's an example:

**Region    VID      Category   # Visitors who ranked this 1st**
NY           87        STORY               10
NY           44        STORY                9
LA           98        NEWS                 4
1 Solution

eelisio2
Path Finder

Try aggregating by region, visitor id, and classification. Then, use streamstats to create the rank field.

sourcetype="mydata" geo_region=XXX | eval Classification=case(like(pagename, "home%"), "HOME",like(pagename, "%news%") AND like(pagename, "%story%"), "STORY",like(pagename, "%markets%"), "MARKETS",like(pagename, "%personalfinance%"), "PERSONALFINANCE",like(pagename, "%search%"), "SEARCH",like(pagename, "%news%"), "NEWS")|stats count as hitcount by geo_region, visid, Classification | sort geo_region,visid,-hitcount | streamstats count as rank by visid | stats count as rankcount by geo_region, Classification, rank

Breaking down the search command, here are the results from each section:

stats count as hitcount by geo_region, visid, Classification

geo_region      visid   Classification          hitcount
CA              100     HOME                    5
CA              100     NEWS                    10
CA              100     MARKETS                 7
CA              200     NEWS                    5
CA              200     HOME                    10

You want to sort in order of the number of hits most to fewest
sort geo_region,visid,-hitcount

geo_region      visid   Classification          hitcount
CA              100     NEWS                    10
CA              100     MARKETS                 7
CA              100     HOME                    5
CA              200     HOME                    10
CA              200     NEWS                    15

Use streamstats to count the number of rows for each visitor id. By starting with 1 for each visitor id, this creates a rank field.

streamstats count as rank by visid

geo_region      visid   Classification          hitcount        rank
CA              100     NEWS                    10              1
CA              100     MARKETS                 7               2
CA              100     HOME                    5               3
CA              200     NEWS                    15              1
CA              200     HOME                    10              2

Then, count the number of rows per Classification and rank

stats count as rankcount by geo_region, Classification, rank

Classification          rank            rankcount
NEWS                    1               2
MARKETS                 2               1
HOME                    2               1
HOME                    3               1

View solution in original post

eelisio2
Path Finder

Try aggregating by region, visitor id, and classification. Then, use streamstats to create the rank field.

sourcetype="mydata" geo_region=XXX | eval Classification=case(like(pagename, "home%"), "HOME",like(pagename, "%news%") AND like(pagename, "%story%"), "STORY",like(pagename, "%markets%"), "MARKETS",like(pagename, "%personalfinance%"), "PERSONALFINANCE",like(pagename, "%search%"), "SEARCH",like(pagename, "%news%"), "NEWS")|stats count as hitcount by geo_region, visid, Classification | sort geo_region,visid,-hitcount | streamstats count as rank by visid | stats count as rankcount by geo_region, Classification, rank

Breaking down the search command, here are the results from each section:

stats count as hitcount by geo_region, visid, Classification

geo_region      visid   Classification          hitcount
CA              100     HOME                    5
CA              100     NEWS                    10
CA              100     MARKETS                 7
CA              200     NEWS                    5
CA              200     HOME                    10

You want to sort in order of the number of hits most to fewest
sort geo_region,visid,-hitcount

geo_region      visid   Classification          hitcount
CA              100     NEWS                    10
CA              100     MARKETS                 7
CA              100     HOME                    5
CA              200     HOME                    10
CA              200     NEWS                    15

Use streamstats to count the number of rows for each visitor id. By starting with 1 for each visitor id, this creates a rank field.

streamstats count as rank by visid

geo_region      visid   Classification          hitcount        rank
CA              100     NEWS                    10              1
CA              100     MARKETS                 7               2
CA              100     HOME                    5               3
CA              200     NEWS                    15              1
CA              200     HOME                    10              2

Then, count the number of rows per Classification and rank

stats count as rankcount by geo_region, Classification, rank

Classification          rank            rankcount
NEWS                    1               2
MARKETS                 2               1
HOME                    2               1
HOME                    3               1
Get Updates on the Splunk Community!

Splunk Observability Synthetic Monitoring - Resolved Incident on Detector Alerts

We’ve discovered a bug that affected the auto-clear of Synthetic Detectors in the Splunk Synthetic Monitoring ...

Video | Tom’s Smartness Journey Continues

Remember Splunk Community member Tom Kopchak? If you caught the first episode of our Smartness interview ...

3-2-1 Go! How Fast Can You Debug Microservices with Observability Cloud?

3-2-1 Go! How Fast Can You Debug Microservices with Observability Cloud? Learn how unique features like ...