I'm looking for the exact query to find outliers or anomalies in my CSV data using stddev in Splunk Enterprise.
Fields from csv: user, action, src, dest, host, _time
Any help would be appreciated.
Thanks in advance!
It is not possible to give you an "exact query" because you haven't provided sufficient detail as to what you are measuring.
I'm trying to measure login counts and spot an unusual number of logins from a particular source.
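That is still insufficient detail for an "exact query", so I will make some assumptions: a login is an event with action="LOGIN", counts are taken per hour and per source, and "unusual" means more than 2 standard deviations away from that source's own mean.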
``` Load your data ```
| inputlookup your.csv
``` Use hourly timeslices ```
| bin _time span=1h
``` Only keep login actions ```
| where action="LOGIN"
``` Count events by hour and source ```
| stats count by _time src
``` Find mean and standard deviation ```
| eventstats avg(count) as avg stddev(count) as stddev by src
``` Find deviation from mean in terms of standard deviation ```
| eval deviation=(count-avg)/stddev
``` Keep hours with sources deviating from their mean by more than 2 standard deviations ```
| where abs(deviation) > 2
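One thing to check: bin needs _time to be numeric (epoch seconds). If your CSV stores _time as a text timestamp instead, a variant along these lines might help. Treat it as a sketch only: the format string "%Y-%m-%d %H:%M:%S" is my assumption about your data, your.csv is the same placeholder as above, and the stddev > 0 filter, the rounding and the final sort are additions of mine, not requirements.

````spl
``` Load your data (your.csv is a placeholder) ```
| inputlookup your.csv
``` Assumption: _time is text like 2024-01-31 13:45:00; adjust the format string to match your data ```
| eval _time=strptime(_time, "%Y-%m-%d %H:%M:%S")
``` Only keep login actions, then use hourly timeslices ```
| where action="LOGIN"
| bin _time span=1h
``` Count events by hour and source ```
| stats count by _time src
``` Find mean and standard deviation per source ```
| eventstats avg(count) as avg stddev(count) as stddev by src
``` Drop sources whose hourly count never varies, since dividing by a stddev of 0 gives no usable deviation ```
| where stddev > 0
| eval deviation=round((count-avg)/stddev, 2)
| where abs(deviation) > 2
``` List the largest positive deviations first ```
| sort - deviation
````

Note that hours in which a source had no logins at all produce no row from stats, so the mean and standard deviation here are calculated over active hours only.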