Splunkers
I'm trying to detect when a user fails more than 5 times within a one-hour window over the last 24 hours. I have the SPL below, but I'd like the community's opinion on whether there is another way to express the same logic in SPL.
SPL used:
index=VPN_Something
| bin _time span=24h
| stats list(status) as Attempts, count(eval(match(status,"failure"))) as Failed, count(eval(match(status,"success"))) as Success by _time user
| eval "Time Range"= strftime(_time,"%Y-%m-%d %H:%M")
| eval "Time Range"= 'Time Range'.strftime(_time+3600,"- %H:%M")
| where Failed > 5
Hi,
You can find quite a few examples of this with the query
site:community.splunk.com login failed more than 5 times per hour solved
Just copy and paste it into Google.
In your example there is at least one misunderstanding: you added "bin _time span=24h", but later you expect _time to be divided into 1-hour spans. Google those examples and then adjust your query or create a new one.
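For example, a minimal sketch of the fixed-bucket version, reusing the index and status values from the question (adjust the field names to your data):

index=VPN_Something status=failure
| bin _time span=1h
| stats count as Failed by _time user
| where Failed > 5

Keep in mind that fixed 1-hour buckets can still miss bursts that straddle a bucket boundary.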
With Splunk there is rarely only one correct solution!
Happy splunking!
Hi @CyberWolf,
There's a tendency among practitioners to bin time into buckets rounded to the nearest time interval, e.g. 1 hour: 00:00, 01:00, 02:00, etc.; however, this results in counting errors. Instead, count using a rolling window in ascending _time order:
index=VPN_Something status=failure
| stats count by _time user
| streamstats time_window=1h sum(count) as failure_count by user
| where failure_count>5
Since you're only interested in a 1-hour window, your search time range only needs to span the last hour plus any allowance for ingest lag and a buffer to accommodate your scheduling interval.
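For instance, if the search is scheduled every 5 minutes and you allow roughly 5 minutes of ingest lag, a time range along these lines would cover the window (the offsets are illustrative assumptions, not prescriptive values):

index=VPN_Something status=failure earliest=-70m@m latest=-5m@m
| stats count by _time user
| streamstats time_window=1h sum(count) as failure_count by user
| where failure_count>5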
See https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Streamstats for information on scaling streamstats.
If you're using accelerated data models or indexed fields or if your raw events are structured in key-value pairs separated by minor breakers, you can use tstats to greatly improve the performance of the search.
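As a sketch of the tstats variant, assuming an accelerated CIM Authentication data model (the dataset and field names here follow CIM and may need adjusting if you use indexed fields instead):

| tstats summariesonly=true count from datamodel=Authentication.Authentication where Authentication.action="failure" by _time span=1m Authentication.user
| rename Authentication.user as user
| streamstats time_window=1h sum(count) as failure_count by user
| where failure_count>5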
If you have the capacity, you might also consider a real-time search that counts events as they're indexed, although the results may be incorrect relative to your requirements. If you have Splunk Enterprise Security, look at the "Access - Excessive Failed Logins - Rule" correlation search. For reference, it's a real-time search scheduled every 5 minutes (*/5 * * * *), with earliest=rt-65m@m and latest=rt-5m@m:
| from datamodel:"Authentication"."Failed_Authentication"
| stats values("tag") as "tag", dc("user") as "user_count", dc("dest") as "dest_count", count by "app","src"
| where 'count'>=6
In the Authentication data model, app would be something like vpn, and src would be a device identifier.
As @isoutamo wrote, there are many approximate solutions to this problem. The correct solution depends on your requirements and your tolerance for counting errors.