Hello,
I am trying to use streamstats with sum(value), and I want to reset that sum after it reaches a certain threshold (1000 in the example below). The Splunk documentation states that "The eval-expression can reference fields that are returned by the streamstats command."
However, it is completely ignoring my reset_after clause. Any help is much appreciated. Example below:
index=events
|table id,_time
|sort 0 -id, _time
|streamstats current=f reset_on_change=true last(_time) as last_seen by id
|eval time_delta=_time-last_seen
|sort 0 -id, _time
|streamstats reset_after="("running_time>=1000")" reset_on_change=true sum(time_delta) as running_time by id
Splunk reference:
reset_after
Syntax: reset_after="("<eval-expression>")"
Description: After the streamstats calculations are produced for an event, reset_after specifies that all of the accumulated statistics are reset if the eval-expression returns true. The eval-expression must evaluate to true or false. The eval-expression can reference fields that are returned by the streamstats command. When the reset_after argument is combined with the window argument, the window is also reset when the accumulated statistics are reset.
Are you implying the first streamstats pipe is interfering with the second streamstats pipe?
The output of the first streamstats is exactly what I expect, and I need current=f because time_delta is the difference between the time of an event and the time of the event before it.
Sorry, that was me misreading your code. Sometimes it is not easy to distinguish code from prose. It is usually clearer to put code in a code block.
Updated to code block 🙂
reset_on_change is overriding reset_after - reset_on_change operates on the value of the field(s) in the by clause
Thank you for the response!
I removed the reset_on_change clause, but it is still ignoring my reset_after: the aggregate "running_time" grows past 1000 and keeps going for all rows. The reason I included reset_on_change is that I also need the sum to reset when the id changes.
It looks like the current=f is what is causing the issue - do you need that? Or can you subtract the time_delta from the running_time afterwards?
Continued thanks for the help
The current=f in the first streamstats pipe lets me determine the difference between an event and the immediately preceding event (time_delta), which is why I do the sort. The output of the first streamstats is exactly what I expect, and the second streamstats does not have current=f, so I'm confused about whether there is any interplay between the two streamstats.
I don't think you need the reset_on_change=t in the second streamstats because you have current=f in the first streamstats making time_delta null which effectively resets running_time.
However, this doesn't explain why reset_after isn't working - looks like a bug?
Definitely looks like a bug. Here is a run-anywhere example demonstrating the issue (although it uses random ids, so it may not show it every time).
| gentimes start=-1 increment=1h
| rename starttime as _time
| fields _time
| eval id=random()%3
| sort 0 -id _time
| streamstats current=f reset_on_change=t last(_time) as last_seen by id
| eval time_delta=_time-last_seen
| sort 0 -id _time
| streamstats reset_after="(sum(time_delta) > 10000)" sum(time_delta) by id
I appreciate the investigation. Too bad it's a bug. Essentially, what I'm looking to accomplish is shown below; if you have any alternate ideas, I would appreciate them.
starting dataset (this is after the sort)
_time | ID |
100 | 1 |
1000 | 1 |
20000 | 1 |
22000 | 1 |
100 | 2 |
400 | 2 |
5000 | 2 |
6100 | 2 |
7900 | 2 |
desired output:
_time | ID | last_seen | time_delta | running_time |
100 | 1 | null | null | 0 |
1000 | 1 | 100 | 900 | 900 |
20000 | 1 | 1000 | 19000 | 19900 |
22000 | 1 | 20000 | 2000 | 2000 ```this was reset since 1000 threshold crossed at 19900``` |
100 | 2 | null | null | 0 ```reset because the ID changed``` |
400 | 2 | 100 | 300 | 300 |
5000 | 2 | 400 | 4600 | 4900 |
6100 | 2 | 5000 | 1100 | 1100 ```this was reset since 1000 threshold crossed at 4900``` |
7900 | 2 | 6100 | 1800 | 1800 ```this was reset since 1000 threshold crossed at 1100``` |
OK, it looks like it is because time_delta has a null value (with current=f, the first event of each id has no last_seen). Try inserting this before the second streamstats:
| fillnull value=0 time_delta
I think this is working!
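For anyone who finds this thread later: putting the pieces together, the search from the original post with the fillnull added before the second streamstats would look roughly like the sketch below. It is assembled only from the snippets posted above (index=events and the field names are the original poster's) and has not been verified beyond the report that the fix seems to work. In particular, whether reset_on_change=true is still needed alongside "by id" in the second streamstats was not settled in this thread, so treat that argument as optional.

```spl
index=events
| table id, _time
| sort 0 -id, _time
| streamstats current=f reset_on_change=true last(_time) as last_seen by id
| eval time_delta=_time-last_seen
| fillnull value=0 time_delta
| streamstats reset_after="("running_time>=1000")" reset_on_change=true sum(time_delta) as running_time by id
```

With the fillnull in place, the first event of each id contributes a time_delta of 0 instead of null, which lines up with the running_time of 0 shown for those rows in the desired output. The same one-line change can be tried on the run-anywhere example earlier in the thread to check the behaviour without touching real data.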