
Detecting Spikes/Anomalies in Failed Logins - over time

cyberdiver
Explorer

My goal is to calculate a confidence score based on how anomalous a user's number of failed logins is compared to their activity over a 30-day period. Then I want to sort those scores in a table showing the user, maybe the relative time of the spike, and the average number of failed logins at that time. That way I can tune thresholds and so on.

This is what I've tried so far; even some "pseudocode" would help here. I suspect the way these commands pass results through the pipes might be part of the problem too.

 

| from datamodel:Authentication.Authentication
| search action="failure"
| timechart span=1h count as num_failures by user
| stats avg(num_failures) as avg_fail_num
| trendline sma5(avg_fail_num) as moving_avg
| eval score = avg_fail_num / moving_avg
| table user, avg_fail_num, moving_avg, score
| sort - score

 

The score variable is supposed to grow the larger fail_num is compared to moving_avg (for example, 40 failures against a moving average of 4 would score 10), which should give me a confidence score for spikes. Quantifying it this way should also open up more analysis opportunities.
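
Rereading that first query, I suspect part of the problem is that timechart ... by user outputs one column per user rather than keeping a user field, so the downstream stats never sees a num_failures field at all. Maybe untable could reshape the results back into one row per user per hour before computing the moving average. Something like this (just a guess at the shape; limit=0 is there so users beyond the default ten don't get lumped into OTHER, and the 24-row window is arbitrary):

| from datamodel:Authentication.Authentication
| search action="failure"
| timechart span=1h limit=0 useother=f count by user
| untable _time user num_failures
| streamstats window=24 avg(num_failures) as moving_avg by user
| eval score = num_failures / moving_avg
| sort - score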

Also, I should clarify that I want to detect users whose activity deviates from their own normal pattern, and only when failed logins exceed a certain number. In other words, not outliers in the big picture of all failed logins, but a specific user acting strangely with a huge increase in failed logins for that user alone.

I want to be able to apply this query's structure to other situations.
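
To put the intent into pseudocode: count failures per user per hour, compare each hour against that user's own baseline over the search window (e.g., the last 30 days), and only keep rows that are both statistically unusual for that user and above an absolute floor. The eventstats approach and both thresholds here (3 standard deviations, 10 failures) are just my guesses at the structure:

| from datamodel:Authentication.Authentication
| search action="failure"
| bin _time span=1h
| stats count as fail_num by user, _time
| eventstats avg(fail_num) as avg_fail_num, stdev(fail_num) as stdev_fail_num by user
| eval score = (fail_num - avg_fail_num) / stdev_fail_num
| where score > 3 AND fail_num > 10
| table user, _time, fail_num, avg_fail_num, score
| sort - score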

Here are some of my other attempts at this (each with its own issues):

Using bin to get average per hour:

 

| from datamodel:Authentication.Authentication
| search action="failure"
| bin _time span=1h
| stats count as fail_num by user, _time
| stats avg(fail_num) as avg_fail_num by user
| trendline sma24(avg_fail_num) as moving_avg
| eval moving_avg=moving_avg*3
| eval score = avg_fail_num / moving_avg
| table user, _time, fail_num, avg_fail_num, moving_avg, score
| sort - score
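
One issue I can see rereading this one: the second stats collapses everything down to one row per user, so _time and fail_num no longer exist by the time table asks for them, and trendline has no by clause, so the sma24 runs across user boundaries. Maybe streamstats would keep the hourly rows while still computing a per-user moving average? Again just a sketch (the 24-row window and the sort are my guesses):

| from datamodel:Authentication.Authentication
| search action="failure"
| bin _time span=1h
| stats count as fail_num by user, _time
| sort 0 user, _time
| streamstats window=24 avg(fail_num) as moving_avg by user
| eval score = fail_num / moving_avg
| table user, _time, fail_num, moving_avg, score
| sort - score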

 

Making a time variable to separate hours:

 

| from datamodel:Authentication.Authentication
| search action="failure"
| regex user="^([^\s]*)$"
| eval date_hour=strftime(_time, "%H")
| stats count as fail_num by user, date_hour
| stats avg(fail_num) as avg_fail_num by user, date_hour
| trendline sma24(avg_fail_num) as moving_avg
| eval score = avg_fail_num / moving_avg
| table user, date_hour, fail_num, avg_fail_num, moving_avg, score
| sort - score
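
Same collapsing problem here, I think: after the second stats there is only one row per user and hour-of-day, so the fail_num that table references is already gone. If I instead keep the individual hourly counts and use eventstats, each hour could be compared against that user's average for the same hour of the day. Again, only a sketch:

| from datamodel:Authentication.Authentication
| search action="failure"
| bin _time span=1h
| eval date_hour=strftime(_time, "%H")
| stats count as fail_num by user, date_hour, _time
| eventstats avg(fail_num) as avg_fail_num by user, date_hour
| eval score = fail_num / avg_fail_num
| table user, _time, date_hour, fail_num, avg_fail_num, score
| sort - score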

 
