Splunk Machine Learning Toolkit: How to display Outliers Chart?

lradics
Path Finder

I'm working with the Detect Numeric Outliers assistant from the Splunk Machine Learning Toolkit 2.3.0. When I use the Toolkit's built-in data in the Showcase, I get a lovely Outliers Chart-style visualization in the "Visualizations" tab. The SPL used is as follows:

| inputlookup supermarket.csv | head 1000 | eventstats avg("quantity") as avg stdev("quantity") as stdev  | eval lowerBound=(avg-stdev*exact(5)), upperBound=(avg+stdev*exact(5)) | eval isOutlier=if('quantity' < lowerBound OR 'quantity' > upperBound, 1, 0)  | fields _time, "quantity", lowerBound, upperBound, isOutlier, *

When I try to use my own data in the assistant, I get an empty Visualizations tab with the message "Your search isn't generating any statistic or visualization results." (I'm running the search in Smart Mode both times.) The SPL I'm using is:

index=xxx source=xxx reactionTime | eventstats avg("reactionTime") as avg stdev("reactionTime") as stdev | eval lowerBound=(avg-stdev*exact(2)), upperBound=(avg+stdev*exact(2)) | eval isOutlier=if('reactionTime' < lowerBound OR 'reactionTime' > upperBound, 1, 0) | fields _time, "reactionTime", lowerBound, upperBound, isOutlier, *   

When I replace the fields command in the search above with table, I get a graph in the Visualizations tab, but the graph is blank and shows the error message "No data to Display."

Adding to my confusion is the fact that the documentation for the table command says not to use it for charts, as it strips away the internal fields, but the Toolkit's User Guide says to use the syntax | table _time, outlier_variable, lowerBound, upperBound for the Outliers Chart.

So now I'm stuck - I'd like to display an Outliers Chart with my own data, but I don't know how to do that. Has anyone run into this problem before, or can anyone point me to where I'm doing something wrong?

Thank you!

1 Solution

cmerriman
Super Champion

Try this:

index=xxx source=xxx reactionTime | sort 0 _time | eventstats avg("reactionTime") as avg stdev("reactionTime") as stdev | eval lowerBound=(avg-stdev*exact(2)), upperBound=(avg+stdev*exact(2)) | eval isOutlier=if('reactionTime' < lowerBound OR 'reactionTime' > upperBound, 1, 0) | table _time, "reactionTime", lowerBound, upperBound, isOutlier, *

lradics
Path Finder

That worked! Thank you, thank you so much 🙂 I'd been staring at it for so long, but I'd never have guessed to try that without your help.

cmerriman
Super Champion

No problem, glad it worked. I wouldn't have thought of it until you mentioned that the leftmost timestamp on your x-axis was your latest time. That's when I realized you probably needed to sort the events. Generally you'd run stats, timechart, or something similar before doing anything else, and it would sort the results automatically, but this works all the same.
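As a rough sketch of the "timechart first" pattern mentioned above, the search might look something like the following. The span=1h bucket size and the choice of avg as the aggregation are illustrative assumptions, not something from this thread. Note also that this version flags outliers among hourly averages rather than among the raw events, so it is not a drop-in equivalent of the sort 0 _time approach.

```
index=xxx source=xxx reactionTime=*
| timechart span=1h avg(reactionTime) as reactionTime
| eventstats avg(reactionTime) as avg stdev(reactionTime) as stdev
| eval lowerBound=(avg-stdev*exact(2)), upperBound=(avg+stdev*exact(2))
| eval isOutlier=if('reactionTime' < lowerBound OR 'reactionTime' > upperBound, 1, 0)
| table _time, reactionTime, lowerBound, upperBound, isOutlier
```

Because timechart emits its rows in ascending _time order, no explicit sort should be needed here.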


cmerriman
Super Champion

Are you seeing data in the Statistics tab? Are all of your fields populated correctly?

Try adding | table _time reactionTime before the eventstats to see if it helps.


niketn
Legend

@lradics, since you have _time in your table, you don't have to worry about visualizing data with the table command. You have retained the required internal field _time.

However, "No results found" implies that your query is not returning any data. As cmerriman suggested, try the base search first. Ensure that the reactionTime field (field names are case-sensitive) is present and has numerical values:

 index=xxx source=xxx reactionTime=* 
 |table _time reactionTime
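
One possible way to check the "has numerical values" part, sketched under the assumption that reactionTime is extracted at search time: tonumber() returns null for values it cannot parse, and count(field) counts only non-null values, so comparing the two counts shows how many events carry a usable number. The field name rtNumeric and the output names total and numeric are made up for this illustration.

```
index=xxx source=xxx reactionTime=*
| eval rtNumeric=tonumber(reactionTime)
| stats count as total count(rtNumeric) as numeric
```

If numeric comes back lower than total, some events have a reactionTime value that is not a plain number.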

lradics
Path Finder

@niketnilay and @cmerriman, thank you for your responses! Unfortunately I tried both and am getting the same result. I am seeing data in the statistics tab, and in the bottom half of the visualization tab as well - just not in my graph. (Oddly, the graph's legend displays a number of outliers - "26 outliers" - but the graph itself remains blank.)

I'm wondering if it could be related to my time window somehow? I set it to "Last 7 Days," but I noticed that the leftmost point along my x-axis is labelled 09:52:32.309 (which is the newest data point I had when I ran the search). Do you know how I could change that?
