Hi, I am trying to find the outliers in a specific set of data (a count of login failures within 5 minutes). I have created a field called residual, which is the prediction (using algorithm LLP) minus the count. I have then set all the positive values to 0, because I only want to catch large differences where the actual count is above the prediction.
What I am trying to do is figure out how to carry over what I have been doing with the Detect Numeric Outliers Assistant into a model I'm creating. Basically, how do I move and adjust the threshold so that it only catches a few outliers?
I have already tried assigning values to threshold as well as lower_threshold and upper_threshold, but it only shades the areas around the data set. Currently I have recreated what I was doing in the Detect Numeric Outliers Assistant and copy/pasted the SPL from that to see if I can assign it to the algorithm. Below is my example:
1| inputlookup loginfailures_count_5m.csv
2| eval _time=strptime(_time,"%Y-%m-%dT%H:%M:%S.%Q") #lines 1 and 2 are the csv file
3| predict count as prediction algorithm=LLP future_timespan=150 holdback=0
4| where prediction!="" AND count!=""
5| eval residual = prediction - count #lines 3 - 5 define prediction and residual
6| eval residual = if(residual < 0, residual, 0) #this line sets all positive values to 0, we only want negative values
7| eventstats avg("residual") as avg stdev("residual") as stdev
8| eval lowerBound=(avg-stdev*exact(6)), upperBound=(avg+stdev*exact(6))
9| eval isOutlier(residual)=if('residual' < lowerBound OR 'residual' > upperBound, 1, 0) #lines 7 - 9 are the SPL from the Detect Numeric Outliers Assistant, may not be the "answer"
10| fit DensityFunction residual show_options="feature_variables" into my_model #have tried different settings here, unsuccessful, this is where we feel the problem lies
11| apply my_model
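For comparison, here is a rough sketch of what lines 10 and 11 might look like with the threshold passed explicitly. It rests on assumptions I have not confirmed: that threshold is the fraction of the area under the fitted density that gets flagged (so a smaller value should flag fewer points), that apply accepts the same option, and that the output field is named IsOutlier(residual). The value 0.001 is purely illustrative. The "#" notes are annotations in the same style as the listing above, not SPL syntax.

```
| fit DensityFunction residual threshold=0.001 into my_model   # 0.001 is an illustrative value, not a tuned one
| apply my_model threshold=0.001                               # assumption: apply takes the same threshold option
| where 'IsOutlier(residual)'=1                                # assumption: name of the flag field written by the model
```

Note that this flag field would be distinct from the isOutlier(residual) field created on line 9, since field names are case-sensitive.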
My DensityFunction Outliers graph looks like this:
but I need it to look like this (populated with the Detect Numeric Outliers Assistant):
Any pointers/settings I haven't tried yet? Does anyone know of a 1-to-1 mapping between the settings in the Assistant and the options of DensityFunction (e.g., which setting corresponds to the "sliding window")? I have looked through the docs here to no avail.