Splunk Search

Calculating 5xx error and getting alerted

ksharma7
Path Finder

Hi all,

I have some data and I want to be alerted when there is a spike in 5xx errors for any endpoint. Each endpoint has its own typical 5xx error trend, and traffic also varies between day and night, so when traffic is higher we will probably see more 5xx errors than when traffic is low.

I tried using the interquartile range method to check for outliers, but I doubt it is suitable, because when traffic increases it will alert without any real reason.

Is there a more appropriate way to do this?

index=rxc sourcetype=rxcapp status=5* endpoint=* | stats count as error by status endpoint

This would be my base query.
I tried the query below, but traffic is variable, so in the morning there will be slightly more errors than at night. In that case this alert will always trigger and create spam.
Is there any better way to work with dynamic thresholds, maybe by calculating the percentage change? But then how do I decide the threshold in that case?

index=rxc sourcetype=rxcapp status=5* endpoint=* earliest=-20m@m latest=now
| bucket _time span=2m
| stats count as error by _time status endpoint
| streamstats median(error) as med p75(error) as p75 p25(error) as p25 by status endpoint
| eval iqr=(p75-p25)
| eval lower=(med-iqr*1.5)
| eval upper=(med+iqr*1.5)
| where error>upper
| fields _time endpoint error status upper lower med iqr

0 Karma

DalJeanis
SplunkTrust

There are lots of ways to do this, and you probably are not going to want to do it based on a rolling window.

Typically, what we do in this situation is to create a summary index and keep track of the typical rates for any given time of day (and any other characteristics you want to consider, such as weekends or holidays.)

So, let's suppose we use a metric index for this, and we have three dimensions - the endpoint name, a weekday/weekend flag, and the hour of the day.

Every night, you load the summary data for the prior day into the metric index. Right after that, you calculate the p90 for each endpoint over the last 30-90 days, for each hour and each weekday/weekend flag, and put the result into a lookup.

Thus, you can just read the lookup to find out what your daily threshold is.
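
As a rough illustration of the steps above, here is a minimal sketch. It simplifies the approach by skipping the summary/metric index and computing the baseline straight from the raw events, which may be too slow on large volumes. The lookup file name endpoint_5xx_thresholds.csv and the field names errors, hour, dow, daytype and p90_errors are hypothetical; the index, sourcetype and endpoint field come from the question. Hours with no 5xx events at all produce no row, so the p90 is biased upward for quiet endpoints.

A nightly scheduled search that builds the lookup could look something like this:

```spl
index=rxc sourcetype=rxcapp status=5* endpoint=* earliest=-30d@d latest=@d
| bin _time span=1h
| stats count as errors by _time endpoint
| eval hour=strftime(_time, "%H")
| eval dow=strftime(_time, "%w")
| eval daytype=if(dow="0" OR dow="6", "weekend", "weekday")
| stats perc90(errors) as p90_errors by endpoint daytype hour
| outputlookup endpoint_5xx_thresholds.csv
```

An hourly alert search could then compare the last complete hour against the stored threshold for the same hour and day type:

```spl
index=rxc sourcetype=rxcapp status=5* endpoint=* earliest=-1h@h latest=@h
| stats count as errors by endpoint
| eval hour=strftime(relative_time(now(), "-1h@h"), "%H")
| eval dow=strftime(relative_time(now(), "-1h@h"), "%w")
| eval daytype=if(dow="0" OR dow="6", "weekend", "weekday")
| lookup endpoint_5xx_thresholds.csv endpoint daytype hour OUTPUT p90_errors
| where errors > p90_errors
```

Because the threshold is a p90, roughly one hour in ten would exceed it even on a normal day, so in practice you may want a higher percentile or a multiplier on top of it.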

0 Karma

ksharma7
Path Finder

Won't something like prediction work in this case?

0 Karma

ksharma7
Path Finder

If I stay with the interquartile approach I was trying, what modification to the query is needed to run it separately for business hours and non-business hours, and do you think that would solve my problem? Is the way I am trying to do this the right approach or a totally wrong one?

0 Karma

ksharma7
Path Finder

Also, for the approach you are describing, can you suggest a sample query to do that?

0 Karma