Here's what I got so far:
index="myindex" (host="192.168.0.100" OR host="192.168.0.101") (msg="login OK" OR msg="login FAILED")
| transaction user maxspan=30s endswith="login OK"
| eval FailedLogons=eventcount-1
| where msg="login FAILED" AND FailedLogons >= 3
| table _time user FailedLogons
For example, a user's account quickly fails to log on 3 times, then successfully logs on. I consider this strange activity and want to track it. This query gets me most of the way there; however, it assumes all events before "login OK" are failures (which may not always be the case). transaction collapses msg so that each distinct value appears only once; however, _raw still contains the combined fields.
Is there a way to look for 3 consecutive failures that end with success?
transaction is almost never the answer. This is a job for streamstats:
index="myindex" (host="192.168.0.100" OR host="192.168.0.101") (msg="login OK" OR msg="login FAILED")
| rename COMMENT as "mark logons and then number them, remembering their time"
| eval OK_flag=case(msg="login OK",1)
| streamstats sum(OK_flag) as OK_group last(eval(case(OK_flag=1,_time))) as next_logon by user
| rename COMMENT as "for each group, count the number of fails"
| eval fail_flag=case(msg="login FAILED",1)
| eval fail_time=case(msg="login FAILED",_time)
| eventstats max(next_logon) as next_logon sum(fail_flag) as fail_count by user OK_group
| rename COMMENT as "drop all groups with less than 3 fails"
| where fail_count >=3
| rename COMMENT as "you can stop here, or analyze a little more, since we haven't limited the time of fails to 30 sec"
| where (next_logon < _time+30)
| rename COMMENT as "roll up each group"
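| stats sum(fail_flag) as fail_count min(fail_time) as first_fail_time max(fail_time) as last_fail_time values(_time) as _time max(next_logon) as logon_time by user OK_group
| rename COMMENT as "re-check the threshold, since the 30 sec filter may have dropped some fails"
| where fail_count >= 3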
| convert ctime(*_time)
The above is aircode (untested), but it should be pretty close.
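Since the search above is untested, one way to try it without touching real data is to generate a few synthetic events with makeresults and append the same pipeline. This is only a sketch: the user alice, the helper field n, and the timing (four failures five seconds apart, then a success) are made-up test values, not anything from the original data.

```
| makeresults count=5
| streamstats count as n
| rename COMMENT as "hypothetical test data: 4 fails then 1 success, 5 sec apart"
| eval _time=now()-(5-n)*5
| eval user="alice"
| eval msg=if(n<5,"login FAILED","login OK")
| rename COMMENT as "newest first, the order a normal search returns events in"
| sort 0 -_time
```

Append everything from the first `| rename COMMENT` line of the search above onward, then check whether the result matches what you expect for that pattern. Changing the multiplier or the position of the "login OK" event gives a quick way to probe the 30-second limit and the grouping.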
Why is transaction "almost never the answer"?
@tmontney - transaction is highly expensive, and in almost every use case the same search can be done better, with fewer resources, by proper use of stats, streamstats and/or eventstats.
Plus, if you engineer your search to use exactly those, then you know exactly what it is doing.
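As a rough illustration of that point, and not a drop-in replacement for the streamstats search above, a coarse version of the original transaction search can be written with stats alone by bucketing time. This sketch assumes fixed 30-second buckets, so it does not check that the failures actually come before the success, and it will miss a burst that straddles a bucket boundary. The field names fail_count and ok_count are chosen here for illustration.

```
index="myindex" (host="192.168.0.100" OR host="192.168.0.101") (msg="login OK" OR msg="login FAILED")
| rename COMMENT as "put every event into a fixed 30 sec bucket"
| bin _time span=30s
| rename COMMENT as "count fails and successes per user per bucket"
| stats count(eval(msg="login FAILED")) as fail_count count(eval(msg="login OK")) as ok_count by user _time
| where fail_count >= 3 AND ok_count >= 1
```

Every step is an explicit count or filter, so it is easy to see what the search is doing and where to tighten it.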
The Splunk Security Essentials app has an example query that does something similar. Look for the "Brute Force Login Attempts" use case.
I've always been meaning to look at this, good idea.