Splunk Search

average count by day

hartfoml
Motivator

I have a search looking for the events I want to look at. Then i want to have the average of the events per day.

I only want the average per day number so that I can alert if it is over or under the average

I have like this search event=foo | stats avg(count) by date_day

What am i doing wrong??

Tags (1)
1 Solution

lguinn2
Legend

I think that you want to calculate the daily count over a period of time, and then average it. This is two steps:

search event=foo
| bucket _time span=1d
| stats count by _time
| stats avg(count) as AverageCountPerDay

I did not rely on date_mday - what if your time range was larger than a month? What if your data doesn't have the date_mday field? Instead, I used the bucket command to set the internal time _time to a one-day span, and counted by that. So, I guess that made it 3 steps.

View solution in original post

royswapan
Engager

Hi,
Is these a way to get "hourly" AVG of 30 days ?

Thanks,
Roy

0 Karma

richgalloway
SplunkTrust
SplunkTrust

@royswapan This is an old thread with an accepted answer so you're unlikely to get a response. Please post a new question.

---
If this reply helps you, an upvote would be appreciated.
0 Karma

colinmchugo
Explorer

HI,

Is there a way of showing the percentage increase or decrease from the commands "stats count as daycount by date_mday | streamstats avg(daycount)" so you can see on a visualization if the events for that day are above or below average? thanks in advance.

Colin

0 Karma

richgalloway
SplunkTrust
SplunkTrust

This is an old thread with an accepted answer so you're unlikely to get a response. Please post a new question.

---
If this reply helps you, an upvote would be appreciated.

colinmchugo
Explorer

thanks richgalloway

0 Karma

lguinn2
Legend

I think that you want to calculate the daily count over a period of time, and then average it. This is two steps:

search event=foo
| bucket _time span=1d
| stats count by _time
| stats avg(count) as AverageCountPerDay

I did not rely on date_mday - what if your time range was larger than a month? What if your data doesn't have the date_mday field? Instead, I used the bucket command to set the internal time _time to a one-day span, and counted by that. So, I guess that made it 3 steps.

View solution in original post

akarivaratharaj
Communicator

I need the count and average of my field to be displayed for every 1 minute. Any suggestions please?

Note: It is not a numeric field

0 Karma

hartfoml
Motivator

this is the one I needed thank much

0 Karma

richcollier
Path Finder

good point on date_mday

0 Karma

richcollier
Path Finder

First of all, the internal field is date_mday. Also, using streamstats will update the average as it goes along. For example:

sourcetype="access_combined_wcookie" status=503 | stats count as daycount by date_mday | streamstats avg(daycount)

output is:

date_mday   daycount         avg(daycount)
    15          2                2.000000
    16          5                3.500000
    18          2                3.000000
    19          2                2.750000
    20          4                3.000000

l0pher
Explorer

date_mday worked like a charm. Where can we find all such "internal" fields?

0 Karma

lguinn2
Legend

BTW, date_mday isn't an internal field - it is extracted from events that have a human-readable timestamp. So it isn't always available.

Also, why streamstats? It is a pretty resource-intensive command. If you want to see the individual days and the average, try eventstats instead. It will look different, though.

I like your solution that shows both the individual days and the average...

richcollier
Path Finder

One of the problems with this approach is defining a proper "over" or "under" the average. +/- 2 times the average? +/- 2 standard deviations? This approach of using avg and stddev is inaccurate if the count of the events in your data do not form a "normal distribution" (bell curve).

If ultimately your goal is to use statistics to learn "normal" behavior, and know when that behavior (count per day) is very different, then a more proper statistical modeling and anomaly detection approach is needed. See Anomaly Detective app: http://splunk-base.splunk.com/apps/68765/prelert-anomaly-detective

0 Karma

richcollier
Path Finder

sorry, there was an error in my answer, i fixed it

0 Karma

hartfoml
Motivator

This might be as simple as total number of events devided by 30 or there could be more complecated math that would through out the low and high, discount any zero values, etc. I was hoping there was a way to handle this in splunk???

0 Karma

hartfoml
Motivator

Hi Rich, Thanks so much for your comments. I found this on the Answers site but I did not know what I was looking at when I got the resultes. I got two collumbs of numbers. one columb was day_mday and the other was avg(count). the mday collumb had changeing numbers and the avg(count) columb had no numbers. Lets see if i can put this a better way for more help. I have events happening every day for 30 days. I want to know only one number (Average Count per day).

0 Karma
Did you miss .conf21 Virtual?

Good news! The event's keynotes and many of its breakout sessions are now available online, and still totally FREE!