Reporting

Get top pages per time period

spammenot66
Contributor

Splunk export has a limitation of 10000 rows. How do i get 5 top pages (either by visit or pageviews) per day? Currently I'm using a data model. For example sake, lets say i there's more than 10000 visited pages for any given day. If i ran the query below with a pipe of |head 5 and ran the search for a full year, it wouldn't give me top 5 pages per day for a full year. Instead it would give me top 5 rows from all pages and days.

 |pivot MainSites HomePages dc(User) AS "DistinctUser"   dc(VisitSession) AS "DistinctVisits" sum(wtPageView) AS "TotalPageViews"  sum(Click) AS "TotalClicks"  SPLITROW TimePeriod_Daily  SPLITROW eventURL  | sort - "TotalPageViews"
0 Karma

spammenot66
Contributor

For anyone who is looking to do something similiar - get top x per y when dealing with a data model/pivot, you'll want to run a subsearch similiar to the one below. You'll get an extra column called sum(HomePages) which is the equivalent of total hits for the current time period.

|pivot MainSites HomePages SPLITROW TimePeriod_Daily  | appendcols  [pivot MainSites HomePages dc(User) AS "DistinctUser"   dc(VisitSession) AS "DistinctVisits" sum(wtPageView) AS "TotalPageViews"  sum(Click) AS "TotalClicks"  SPLITROW TimePeriod_Daily  SPLITROW eventURL  | sort - "TotalPageViews"| head 5]
0 Karma

alacercogitatus
SplunkTrust
SplunkTrust

I don't think that is going to do what you think it does. Using a subsearch with 2 pivot commands is not a good practice to start, since using datamodel is the preferred method any way.

0 Karma

alacercogitatus
SplunkTrust
SplunkTrust

Without knowing the specifics of your Data Model, I can give you a "normal search" that gives you what you want, and then perhaps you can apply it to your data model.

<find_your_web_logs> | bin span=1d _time | top limit=5 VisitSession by _time

This will split your data in 1 day chunks, and then do a top with the split by of _time, which then gives the top 5 per day.

Sticking it into datamodel land (using some assumptions):

|datamodel <datamodel> <object> search | bin span=1d _time | top limit=5 VisitSession by _time

spammenot66
Contributor

If i used | top limit=5 VisitSession by _time, my result will look something like
_time VisitSession count percent

This tends to be completely useless if i want the top 5 eventURLs per month for the span of 12 months.
Time Period eventURL DistinctUser DistinctVisits TotalPageViews TotalClicks
month1 URL1 10 50 100 0
month1 URL2 9 40 100 0
month1 URL3 7 31 100 30
month1 URL4 6 30 100 20
month1 URL5 5 20 100 10
month2 URL1 10 50 100 0
month2 URL2 9 40 100 0
month2 URL3 7 31 100 30
month2 URL4 6 30 100 20
month3 URL5 5 20 100 10

0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

Network to App: Observability Unlocked [May & June Series]

In today’s digital landscape, your environment is no longer confined to the data center. It spans complex ...

SPL2 Deep Dives, AppDynamics Integrations, SAML Made Simple and Much More on Splunk ...

Splunk Lantern is Splunk’s customer success center that provides practical guidance from Splunk experts on key ...

[Puzzles] Solve, Learn, Repeat: Matching cron expressions

This puzzle (first published here) is based on matching timestamps to cron expressions.All the timestamps ...