Splunk Search

How to combine top and bin in splunk query

JamesWierzba
Observer

I need to find abnormalities in my data. The data I have is individual views for certain movie titles. I need to find content that was abnormally popular over some small time interval, say 1 hour. And check a few weeks worth of data.

One option is to run a query manually for each hour

 

 

 

``` Run this over 60m time window ```
index=mydata 
| top limit=100 movieId

 

 

 

Obviously I don't want to run this query 24 * 7 = 168 times for one weeks worth of data.

How can I bin the data into time buckets, and get a percentage ratio by movieId? This is what I came up with:

 

 

 

``` Run this over 1+ week ```
index=mydata
| bin span=60m _time
| top limit=100 movieId, _time

 

 

 

This does not help me because the output of `top` is showing me a percentage based on the entire input set of data. I need a "local" percentage, i.e. a percentage based on only that slice of data in the bin.

 

I'm wondering if eventstats or streamstats can be useful here but I was not able to come up with a query using those commands

Labels (1)
0 Karma

bowesmana
SplunkTrust
SplunkTrust

So rather than use top, which is somewhat useful, it is generally easier to get results using stats

index=mydata
| bin _time span=1h
| stats count by _time movieId
| eventstats sum(count) as total by _time
| eval percent=round(count/total*100, 2)

so this search will give you 1h buckets with counts by movieId then the eventstats will calculate the total per hour of all movies and then the percent calc will get the percentage of each movie within that hour.

However, although efficient, I expect what you may want is a streamstats variant, which will give you a sliding 60 minute window, so if your peak for a movie is from 20:30 to 21:30 this will show using streamstats, but not necessarily stats by 1h buckets

You could do something like this

index=mydata
| streamstats time_window=1h count as userCount by movieId
| streamstats time_window=1h count as totalCount
| eval percent=round(userCount/totalCount*100, 2)
| timechart span=1h max(percent) as maxPercent by movieId

which will show the max percent for all movies in 1h buckets, but with the time calculated as a sliding window

You can then test for thresholds or further manipulate your data - the timechart above is one way of looking at it, but you can do anything from there

0 Karma
Get Updates on the Splunk Community!

See your relevant APM services, dashboards, and alerts in one place with the updated ...

As a Splunk Observability user, you have a lot of data you have to manage, prioritize, and troubleshoot on a ...

Index This | What goes away as soon as you talk about it?

May 2025 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with this month’s ...

What's New in Splunk Observability Cloud and Splunk AppDynamics - May 2025

This month, we’re delivering several new innovations in Splunk Observability Cloud and Splunk AppDynamics ...