I'm trying to do something a little wonky here, so please bear with me. The code below is the logical flow of what I'm trying to accomplish. However, I know for a fact it won't work as written. Can you give me some insight?
... | eval threshold=(outlier action=rm param=10 (stdev(count)))
Basically I'm trying to create a field, "threshold", which calculates the standard deviation of my results, removing any outliers prior to calculating the standard deviation. (Basically, I don't want that one result of 7,000 skewing my standard deviation upwards when normally it would have been, say, 10).
Is this what you're looking for? Here's a search that does what I think you're asking for, using indexer lag (_indextime - _time) as the example value. If it does what you need, you'll just have to modify it to fit your search/data:
index=_internal | eval indexer_lag =_indextime - _time
| eventstats p25(indexer_lag) as q1, p75(indexer_lag) as q3 | eval iqr=q3-q1 | eval threshold=10*iqr
| where indexer_lag < threshold
| eventstats stdev(indexer_lag) as threshold_stddev
I'll explain each part:
index=_internal | eval indexer_lag =_indextime - _time
^^^ Calc our index lag for each event
| eventstats p25(indexer_lag) as q1, p75(indexer_lag) as q3 | eval iqr=q3-q1 | eval threshold=10*iqr
^^^ Now use eventstats to get q1 and q3 in-line with our events, then calc the interquartile range, and set the threshold to 10*iqr (based on the param=10 in your original post).
| where indexer_lag < threshold
^^^ Only keep events whose lag is less than our threshold, i.e. remove our 10*iqr outliers
| eventstats stdev(indexer_lag) as threshold_stddev
^^^ Finally use eventstats again to calculate the new standard deviation in-line based on our new list of events.
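For comparison, the built-in outlier command from your original post can do the filtering step in one go. This is only a sketch using the same indexer_lag example: outlier defines its cutoff from the IQR in its own way, so the events it drops may not match the "indexer_lag < 10*iqr" filter above exactly, and you should check the results against your own data.

```spl
index=_internal
| eval indexer_lag=_indextime - _time
| outlier action=remove param=10 indexer_lag
| eventstats stdev(indexer_lag) as threshold_stddev
```

The manual eventstats version is more verbose, but it leaves q1, q3, iqr and threshold on every event, which makes it easier to see why a given event was kept or dropped.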
So you don't need the indexer_lag field, per se, but your overall search will be similar. If you're looking at a specific sourcetype across all indexes, then your search may start like this:
index=* sourcetype=tfitzgerald15s_type |
The indexer_lag field is just something I calculated for the example so I'd have a value to base the threshold on. In your case it might be a calculation you have to do for CPU usage, HTTP response times, or transaction duration.
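As a rough sketch of how the same pattern might look for the count field from the original question: the sourcetype name your_sourcetype is a placeholder, and the hourly bucketing (span=1h) is purely an assumption about how count gets produced, so swap in whatever stats/timechart you already have.

```spl
index=* sourcetype=your_sourcetype
| bin _time span=1h
| stats count by _time
| eventstats p25(count) as q1, p75(count) as q3
| eval iqr=q3-q1
| eval threshold=10*iqr
| where count < threshold
| eventstats stdev(count) as threshold_stddev
```

Here each row is one hourly bucket, so a single bucket with a count of 7,000 would be dropped by the where clause before the standard deviation is calculated, provided it is above 10*iqr.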
Also, if this answers your question don't forget to accept/up-vote the answer 🙂
Awesome, thanks! I do just have one question. I'm not pointing to a specific indexer, I'm looking at a specific sourcetype. Would I still need the indexer_lag, and what does that represent? Apologies for the admitted newbie question there.