eval with mathematical calculations?

tfitzgerald15 · 10-04-2013

I'm trying to do something a little wonky here, so please bear with me. The code below shows the logical flow of what I'm trying to accomplish, though I know for a fact it won't work as written. Can you give me some insight?

... | eval threshold=(outlier action=rm param=10 (stdev(count)))

Basically, I'm trying to create a field, "threshold", that holds the standard deviation of my results with any outliers removed before the calculation. (I don't want that one result of 7,000 skewing my standard deviation upwards when it would normally have been, say, 10.)

jhupka · 10-04-2013

Is this what you're looking for? Here's a search that does what I think you're asking for, using indexer lag (_indextime - _time) as the example value. If it does what you want, you'll just need to modify it to fit your search/data:

index=_internal | eval indexer_lag=_indextime - _time
| eventstats p25(indexer_lag) as q1, p75(indexer_lag) as q3 | eval iqr=q3-q1 | eval threshold=10*iqr
| where indexer_lag < threshold
| eventstats stdev(indexer_lag) as threshold_stddev

I'll explain each part:

index=_internal | eval indexer_lag=_indextime - _time

^^^ Calculate the indexing lag for each event.

| eventstats p25(indexer_lag) as q1, p75(indexer_lag) as q3 | eval iqr=q3-q1 | eval threshold=10*iqr

^^^ Now use eventstats to get q1 and q3 in-line with our events, then calculate the interquartile range and the threshold, using 10*iqr to match the param=10 in your original post.

| where indexer_lag < threshold

^^^ Only keep events whose lag is less than our threshold, i.e. remove the 10*iqr outliers.

| eventstats stdev(indexer_lag) as threshold_stddev

^^^ Finally, use eventstats again to calculate the new standard deviation in-line, based on the events that remain.
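
For anyone who wants to sanity-check the arithmetic outside Splunk, here is a minimal Python sketch of the same steps (quartiles, IQR, 10*iqr threshold, filter, standard deviation). The sample counts and the function name trimmed_stdev are made up for illustration, and Python's quartile interpolation is not guaranteed to match Splunk's p25/p75 exactly, so treat the numbers as indicative only.

```python
import statistics


def trimmed_stdev(values, multiplier=10):
    """Mirror the search above: keep values below multiplier * IQR,
    then return (sample stdev of kept values, threshold, kept values)."""
    # quantiles(n=4) returns the three quartile cut points: q1, median, q3.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Same rule as the search: the threshold is multiplier * iqr itself,
    # not a fence measured from q3.
    threshold = multiplier * iqr
    kept = [v for v in values if v < threshold]
    # statistics.stdev is the sample standard deviation.
    return statistics.stdev(kept), threshold, kept


# Hypothetical counts: mostly around 10, plus one extreme result of 7,000.
counts = [8, 9, 10, 10, 11, 12, 7000]

raw = statistics.stdev(counts)
trimmed, threshold, kept = trimmed_stdev(counts)

print("stdev with outlier:   ", round(raw, 2))
print("threshold (10 * iqr): ", threshold)   # 20.0
print("kept values:          ", kept)        # [8, 9, 10, 10, 11, 12]
print("stdev without outlier:", round(trimmed, 2))
```

With these sample values the 7,000 result is dropped, and the standard deviation falls from roughly 2,600 to about 1.4.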

jhupka · 10-07-2013

So you don't need the indexer_lag field, per se, but your overall search will be similar. If you're looking at a specific sourcetype over all indexes, then your search may start like this:

index=* sourcetype=tfitzgerald15s_type |

indexer_lag is just a field I calculate for the example; it is the value I chose to base my threshold on. In your case it might be a calculation for CPU usage, HTTP response times, or transaction duration.

Also, if this answers your question don't forget to accept/up-vote the answer 🙂

tfitzgerald15 · 10-07-2013

Awesome, thanks! I do just have one question. I'm not pointing to a specific indexer; I'm looking at a specific sourcetype. Would I still need indexer_lag, and what does it represent? Apologies for the admittedly newbie question.
