Using timechart, is there a way to calculate mean/...

jhochstetler · ‎08-04-2017

I have a process that experiences about 8640 events per day, or what I would expect to be an average of 0.1 events per second. Multiple events could occur in the same second, but overall, there are about 8500ish events per day.

Example search:

search status=200 earliest=7/1/2017:00:00:00 searchtimespandays=1 | eval _time=floor(_time)  | stats count by _time |  | timechart span=1d count mean(count) min(count)

The "eval _time=floor(_time)" serves to round downward each event's time to the nearest second.

The "stats count by _time" then computes, for each second, how many events occurred within that second.

What I have seen online suggests that a timechart command would allow computation of max, mean, etc. across the entire timespan (1d).

What I would expect here is something like the following:

_time   count   mean(count)
2017-07-01  8141    0.094224

But what I am getting is:

_time   count   mean(count)
2017-07-01  8141    1.091021

So it appears that I am getting averages that are only including the seconds for which there is at least one event. In my case, I need to also include the zero-event seconds, which this doesn't appear to be doing.

I tried table | makecontinuous span=1s, but that complained that too many rows (> 50000) were being generated.

Is there a way to calculate mean/max/min, etc while including the 'zero-event' seconds?

Thanks!

DalJeanis · ‎08-04-2017

You are running into a problem because you did not rename count in the stats command. Change to...

 | stats count as mycount by _time 
 | timechart span=1d count mean(mycount) min(mycount)

jhochstetler · ‎08-07-2017

The renaming doesn't seem to change anything for my queries. With or without renaming, the mean(count) or mean(mycount) comes out greater than 1.

Is there something else I'm missing?

Thanks!

DalJeanis · ‎08-07-2017

Okay, try this

| stats count as mycount by _time 
| appendpipe 
    [|stats min(_time) as mintime, max(_time) as maxtime 
     | eval mytime=mvrange(mintime,maxtime+1) 
     | table mytime 
     | eval mycount=0 
     | mvexpand mytime 
     | rename mytime as _time 
     ]
| stats sum(mycount) as mycount by _time 
| eval day=strftime( _time,"%Y-%m-%d")
| stats count mean(mycount) as avgcount  by day

And if the above results look right, then replace the last two lines with ...

| timechart span=1d count mean(mycount) as avgcount

jhochstetler · ‎08-07-2017

This appears to work for the first 2 hours or so of the timeframe. I think appendpipe is limited to ~10K returns, so it's not computing accurately past the first 2.7 hours or so. 10000/3600 = 2.7 ?

Using timechart, is there a way to calculate mean/max/min, etc while including the 'zero-event' seconds?

Developer Spotlight with Paul Stout

State of Splunk Careers 2024: Maximizing Career Outcomes and the Continued Value of ...

Data-Driven Success: Splunk & Financial Services