How can we detect excessive overlapping alerts?

ddrillic
Ultra Champion

We run into situations where application teams schedule their alerts at the top of the hour, and by the time we (the Splunk team) catch it, it may be too late.

Is there a way to produce a report that lists the run times and detects periods of excessive usage?


skoelpin
SplunkTrust

Yeah, you can use the _internal index for this. Filter explicitly on savedsearch_name so that only events for saved searches are returned:

index=_internal savedsearch_name=*
| timechart max(run_time) AS run_time by savedsearch_name
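
To see when the load piles up, rather than which search runs longest, one option is to bucket the scheduler events by the minute of the hour they were scheduled for. This is a minimal sketch, assuming the scheduler events on your version carry scheduled_time (epoch seconds) and run_time; the names minute_of_hour, searches, runs, and total_run_time are just labels chosen here.

```spl
index=_internal sourcetype=scheduler savedsearch_name=* status=*
| eval minute_of_hour=strftime(scheduled_time, "%M")
| stats dc(savedsearch_name) AS searches count AS runs sum(run_time) AS total_run_time by minute_of_hour
| sort - total_run_time
```

If most of the rows at the top are minutes such as 00, 15, 30, and 45, that suggests schedules are clustered and could be staggered.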

ddrillic
Ultra Champion

Thank you @skoelpin.

I changed the max to sum and we can see -

[Screenshot: timechart of summed run_time by savedsearch_name]

We can see that at each quarter of the hour we have peak usage.
Can we find out from _internal how many searches were skipped?


skoelpin
SplunkTrust

Yes, you sure can!

index=_internal sourcetype=scheduler status=skipped NOT "_ACCELERATE*"
| timechart count by savedsearch_name
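
If you also want to know why searches were skipped, a variation along these lines may help. It is a sketch that assumes the skipped events in the scheduler log include the app and reason fields; events missing either field are dropped by the stats split, so treat the counts as a lower bound.

```spl
index=_internal sourcetype=scheduler status=skipped NOT "_ACCELERATE*"
| stats count AS skipped by savedsearch_name, app, reason
| sort - skipped
```

The reason text usually indicates whether a concurrency limit was hit or the previous run of the same search was still in progress.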

ddrillic
Ultra Champion

Just ran -

index=_internal sourcetype=scheduler status=skipped NOT "_ACCELERATE*"
| timechart count

It shows -

[Screenshot: timechart of skipped search counts]


ddrillic
Ultra Champion

The totals for an hour are -

[Screenshot: hourly totals of skipped searches]


skoelpin
SplunkTrust

Yeah, you have a problem with skips at 4am. You should trend this over time with timewrap to see if there's a pattern. Most likely, other searches are competing for resources, running long, and causing the skips. You can fix this by changing the schedule window of the affected searches (schedule_window in savedsearches.conf) from 0 to auto.
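
As a rough illustration of the timewrap idea, the search below overlays each day's hourly skip counts on one chart, so a recurring spike at the same hour stands out. It is a sketch that assumes you run it over a multi-day range (for example, the last 7 days); the 1h span and 1d wrap period are choices made here, not requirements.

```spl
index=_internal sourcetype=scheduler status=skipped NOT "_ACCELERATE*"
| timechart span=1h count AS skipped
| timewrap 1d
```

Each resulting series corresponds to one day, so lines that peak at the same hour point to a schedule-driven pattern rather than a one-off event.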

You can split by savedsearch_name or get a total over a span of time by adding span=1h to the timechart. We use this search to alert us and cut a ticket when we start skipping. Skips are unacceptable for us.
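
One way such an alert search could look is sketched below. The one-hour window and the threshold of 0 are assumptions to adjust for your environment, and the ticketing step would be handled by whatever alert action you attach to the saved search.

```spl
index=_internal sourcetype=scheduler status=skipped NOT "_ACCELERATE*" earliest=-1h@h latest=@h
| stats count AS skipped by savedsearch_name
| where skipped > 0
```

Scheduled hourly with a trigger condition of "number of results greater than 0", this would fire whenever any search was skipped in the previous hour.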

ddrillic
Ultra Champion

Much appreciated @skoelpin.
