Splunk Search

Calculating stdev by individual users

danataylor
Engager

Hi,

I'm trying to build the following logic and failing: For each user in my Windows Event Logs, calculate the stdev and boundaries for the distinct count (averaged daily) of servers logged into, for each specific user. I would then theoretically set an alert to yell when any user reaches above their threshold.

I have read the "Finding and removing outliers" doc, but that doesn't seem to allow creating upper and lower limits for each user (i.e. "by user"). I've tried to adapt that information to fit this model and failed. Maybe I'm not understanding it correctly. My attempts generally look like this:

| eventstats dc(dest_nt_host) as new_dc, avg(new_dc) as new_avg, stdev(new_avg) by user as new_stdev
| eval upper = new_avg+(new_stdev*2)
| eval lower = new_avg-(new_stdev*2)    

Any advice or guidance on this problem would be greatly appreciated!

0 Karma

danataylor
Engager

Edited to clarify that I am seeking to get stdev of the distinct count (averaged daily) of servers logged into, by each individual user.

0 Karma

Vijeta
Influencer

The "as" clause should come first, and then "by user":

| eventstats dc(dest_nt_host) as new_dc, avg(new_dc) as new_avg, stdev(new_avg)  as new_stdev by user
 | eval upper = new_avg+(new_stdev*2)
 | eval lower = new_avg-(new_stdev*2)   
0 Karma

danataylor
Engager

Thank you! However, this search is still not able to calculate thresholds for each specific user.

0 Karma

Vijeta
Influencer

Can you please share some sample events? Also, why are you using eventstats? Can't it be done with stats?

0 Karma

danataylor
Engager

It's generic Windows Event Logs, where dest_nt_host is a value present in each log. No specific reason for using eventstats over stats. I've seen some examples using streamstats, but that doesn't give me output for upper or lower either.

0 Karma

Vijeta
Influencer

Did you try using stats? It should give you upper and lower values.

0 Karma

Vijeta
Influencer

I just realized you are running a stats function on a field created with "as" in the same command (avg(new_dc), where new_dc is defined in the same eventstats), which will return blank.
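
To illustrate the point, here is a minimal sketch, assuming the user and dest_nt_host fields from the original search. All functions in a single eventstats are evaluated against the incoming events, so a field created with "as" does not exist yet for the other functions in that same command. Giving each dependent step its own command makes the earlier result available to the later one. This has not been run against real data.

```spl
<yoursearch>
| eventstats dc(dest_nt_host) as new_dc by user
| eventstats avg(new_dc) as new_avg by user
```

Note that new_dc is the same on every event for a given user here, so new_avg simply equals new_dc and a stdev over it would be 0. A second dimension, such as the day, is needed before there is anything to average over.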

0 Karma

danataylor
Engager

How else would you perform recursive stats operations? Separate lines (new stats command) for each step?

0 Karma

danataylor
Engager

Also, I suppose I'm missing a way for this to determine the average on a daily basis over a long period of time. That's why I was considering streamstats, for time_window=1d.
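
One possible way to get the daily dimension without streamstats is to bucket events by day first, then aggregate per user. This is only a sketch under assumptions: daily_dc is a hypothetical field name, the user and dest_nt_host fields come from the original search, and it has not been tested against real data.

```spl
<yoursearch>
| bin _time span=1d
| stats dc(dest_nt_host) as daily_dc by _time user
| eventstats avg(daily_dc) as new_avg, stdev(daily_dc) as new_stdev by user
| eval upper = new_avg + (new_stdev * 2)
| eval lower = new_avg - (new_stdev * 2)
| where daily_dc > upper
```

The first stats collapses the events to one row per user per day. The eventstats then adds each user's own average and stdev to every one of that user's daily rows, so the final where can compare a given day against that user's threshold. Days on which a user has no events produce no row, so they are not counted as zero in the average, and a user needs several days of data before the stdev is meaningful.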

0 Karma

Vijeta
Influencer

Maybe something like this:

| streamstats dc(dest_nt_host) as new_dc by user
| eventstats avg(new_dc) as new_avg by user
| eventstats stdev(new_avg) as new_stdev by user
| eval upper = new_avg+(new_stdev*2)
| eval lower = new_avg-(new_stdev*2)
0 Karma

danataylor
Engager

I should note that I am seeking the stdev of the distinct count (averaged daily) of servers logged into, by each user. I edited my OP to reflect this.

0 Karma

Vijeta
Influencer

I haven't tried the search, but something along the lines below should work:

<yoursearch>
| bin span=1d _time
| eventstats dc(dest_nt_host) as distinct_host by _time user
| eventstats dc(_time) as count1
| eval avg=distinct_host/count1
| stats values(avg) as new_avg, stdev(avg) as new_stdev by user
| eval upper = new_avg+(new_stdev*2)
| eval lower = new_avg-(new_stdev*2)
0 Karma

danataylor
Engager

I tried and it did not return values for those

0 Karma