Splunk Search

How to streamstats with time_window and keep only the largest count?

dedupper
Explorer

I'm trying to use the streamstats-command with time_window to track when certain user actions happen more than twice in a span of an hour. My search is like this ("dedup _time" because we get duplicate rows)

<search>
| sort _time
| fields _time user
| dedup _time
| streamstats time_window=60min count as amount by user
| where amount > 2
| table _time user amount

This search works correctly otherwise, but the problem is that it triggers multiple times when the user action happens more than twice in an hour. So I get results like:

2022-01-12 16:04:56.482 username1 3
2022-01-12 16:07:58.525 username1 4
2022-01-12 16:13:16.137 username1 5
2022-01-12 16:14:30.255 username1 6

So how can I get only the largest result (in this case 6) in a sequence like this? I can't use "dedup user" because the alert may trigger for the same user at some other time as well, and that should be reported as its own case.

Any help is greatly appreciated.

Labels (1)
0 Karma
1 Solution

PickleRick
SplunkTrust
SplunkTrust

It's a tricky one and I can't find a reasonable solution with streamstats.

But you can do it another way.

<your search>
| transaction maxspan=1h maxevents=-1
| table _time eventcount

I must admit though that it's an ugly solution (and transaction command has its limits).

And you might want to resort your events from earliest to lates instead of the default latest to earliest order.

View solution in original post

PickleRick
SplunkTrust
SplunkTrust

I'd be cautions with dedup (interesting - that's another topic today with dedup :-)). If you only do dedup on _time you might lose events for different users at the same time (applicability of course depends on the resolution of your _time). If you have simply duplicated events, you should rather use "dedup _time user".

And about the streamstats - well, it just works this way. You're asking for the count over a sliding window so it calculates it for each event over range-size window "backwards" in time.

And it's not obvious how those logins should be counted if - for example - a user logs in every 10 minutes over a span of 5 hours. What result would you expect?

dedupper
Explorer

"a user logs in every 10 minutes over a span of 5 hours. What result would you expect?"

That's a good point. Thinking about it, I would perhaps want 5 alerts of six events each - like binning by 60min, but only if the alert triggers. So, if the alert triggers, create a bin from the time of the first event in the sequence to +60 min. After that start running the streamstats again. I don't know if this is possible in Splunk though.


And thank you for pointing out the problem with dedup.

0 Karma

PickleRick
SplunkTrust
SplunkTrust

It's a tricky one and I can't find a reasonable solution with streamstats.

But you can do it another way.

<your search>
| transaction maxspan=1h maxevents=-1
| table _time eventcount

I must admit though that it's an ugly solution (and transaction command has its limits).

And you might want to resort your events from earliest to lates instead of the default latest to earliest order.

dedupper
Explorer

This solution works, thank you for your help!

We only have a couple thousand of these events in the index in total, so memory usage should not be a problem.

0 Karma
Get Updates on the Splunk Community!

How to Monitor Google Kubernetes Engine (GKE)

We’ve looked at how to integrate Kubernetes environments with Splunk Observability Cloud, but what about ...

Index This | How can you make 45 using only 4?

October 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with this ...

Splunk Education Goes to Washington | Splunk GovSummit 2024

If you’re in the Washington, D.C. area, this is your opportunity to take your career and Splunk skills to the ...