Consolidated Alerting using TOP command

newbie2tech
Communicator

Hi Team,

We have an application platform that acts as a gateway for multiple applications (1000+ apps), and we have ingested the platform logs into Splunk.

The platform team wants to set up an alert for each application that fires when the number of HTTP 500 requests breaches a threshold, say 5% of the total requests for that app. The alert schedule is every 5 minutes. Each alert should be sent to the corresponding app team; with more than 1000 apps, the platform team cannot handle them all.

I am using the top command with limit=0. This calculates each percentage against the total events across all applications in the last 5 minutes, but I want to treat the count of each specific application as 100%, then calculate the percentage of 500 requests within that app and check whether the threshold is violated.

My current query:

index=abc | top HttpStatus app limit=0

This is what the output of my current query looks like; App1 and App2 together add up to 100%:

HttpStatus  app        count    percent
400         App1        30      0.091609
200         App1        15      0.045804
500         App1        6       0.018322
200         App2        3813    11.643459
400         App2        2       0.006107
500         App2        28882   88.194699

This is the output I want, so that the alert can trigger at the app level: the App1 breakdown sums to 100%, and likewise the App2 breakdown sums to 100%. Then I can join it with a lookup, check whether the 500 requests have exceeded the threshold percentage, and trigger the alert.

HttpStatus  app        count    percent
400         App1        30      58.82352941
200         App1        15      29.41176471
500         App1        6       11.76470588
200         App2        3813    11.66162033
400         App2        2       0.006116769
500         App2        28882   88.3322629

I want to use a single consolidated alert that can trigger on each application's threshold violation (I plan to use a lookup for the thresholds), as we do not want to set up 1000-odd individual alerts, one per application.

Can you share your thoughts on whether there is a way to achieve this or something similar?

Thanks!

1 Solution

micahkemp
Champion

Try:

index=abc | top HttpStatus app limit=0 | eventstats sum(count) AS app_total BY app | eval app_percent=(count/app_total)*100
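
Building on the query above, one possible way to turn it into the single consolidated alert described in the question is to keep only the HTTP 500 rows and compare each app's percentage against a per-app threshold from a lookup. This is only a sketch under assumptions: the lookup name app_thresholds and its fields threshold_pct and team_email are hypothetical and would have to exist in your environment, and the fallback of 5 simply reuses the example threshold from the question.

```spl
index=abc
| top HttpStatus app limit=0
| eventstats sum(count) AS app_total BY app
| eval app_percent=(count/app_total)*100
| search HttpStatus=500
| lookup app_thresholds app OUTPUT threshold_pct team_email
| eval threshold_pct=coalesce(threshold_pct, 5)
| where app_percent > threshold_pct
```

With the sample data from the question, both App1 (6 of 51, about 11.76%) and App2 (28882 of 32697, about 88.33%) would exceed a 5% threshold and remain in the results. Scheduled every 5 minutes with the trigger set to fire for each result, this should produce one alert per violating app, and an email action could address the owning team through a token such as $result.team_email$, assuming the lookup supplies that field.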

newbie2tech
Communicator

Thank you micahkemp, it worked like a charm.
