Streamstats with time window

gschmitz
Path Finder

Hi,
a question about streamstats as described here:
http://docs.splunk.com/Documentation/Splunk/5.0.1/SearchReference/Streamstats
It works as described, but my query requires me to look at events based on a time range instead of a fixed number of events. Do you know of a way to manage that?
In other words, I want an accumulated count of all events in the 24 hours before the one I'm looking at right now. The indexing volume might be a good example.

If window could accept the syntax from earliest or latest, that would be awesome and look like:
streamstats avg(foo) window=-24h

EDIT: Maybe join can help, but I couldn't make _time of the parent query be an input to earliest and latest of the subsearch.

martin_mueller
SplunkTrust

This is not strictly what you describe, but may do as a workaround.

Consider two steps. First, you count or sum using a timechart (or bin and stats, if you prefer). Second, you use streamstats with an integer window, since you now know the number of buckets per 24 hours.

In your example you mentioned avg(foo). In that case you need to think about the information lost when averaging in two steps. For example, if you bin by minute and one minute has ten events while another has only one, the single event will carry far more weight in the final average than any of the ten. One solution is to keep a sum and a count, and compute the average yourself at the very end.
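
A side note for readers on newer releases: the streamstats reference documents a time_window argument that takes a time span directly, which is close to the syntax wished for in the question. The following is only a minimal sketch, assuming a version that supports time_window (it may not be available in the 5.0.1 release linked in the question) and events sorted by time beforehand. The output name avg_foo_24h is purely illustrative.

```spl
... | sort 0 _time
    | streamstats time_window=24h avg(foo) as avg_foo_24h
```

Where time_window is not available, the two-step workaround described above still applies.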

martin_mueller
SplunkTrust

Indeed, the point of a timechart is to have continuous values without missing buckets.

gschmitz
Path Finder

Sorry for not seeing this earlier. I expected Splunk to send an email for any replies.
How would this work with missing buckets? window only counts events, and timechart doesn't render empty buckets, does it?
EDIT: Never mind, timechart does return zero here, so this is a solution you can use down to 1s resolution!
Thx!

martin_mueller
SplunkTrust

How about this?

...  | timechart span=1m count, sum(field) as sum_field | streamstats window=1440 sum(count) as total_count, sum(sum_field) as total_sum

You'll get total_count and total_sum over a sliding window of 1440 one-minute buckets, that is, the 24 hours ending at whichever minute you look at. From those two values you can eval the rolling 24h average.
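
To make that last step concrete, here is a minimal sketch that extends the search above with the eval. The name avg_24h is only illustrative, and field is the same placeholder as in the search above. The if() guard is an added assumption, there to avoid dividing by zero when an entire window contains no events.

```spl
... | timechart span=1m count, sum(field) as sum_field
    | streamstats window=1440 sum(count) as total_count, sum(sum_field) as total_sum
    | eval avg_24h=if(total_count > 0, total_sum / total_count, null())
```

Because the sum and the count are carried separately until the end, every event is weighted equally no matter how busy its minute was. The first 1439 rows cover less than a full 24 hours, since the window has not filled up yet.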

gschmitz
Path Finder

Thank you for the idea. While it works very well as an approximation, I'm still wondering how you would go about keeping a count and a sum.
|bucket kb, count(*) span=1d did not work out for me at least.
